dptech-corp / uni-fold

An open-source platform for developing protein models beyond AlphaFold.

Home Page: https://doi.org/10.1101/2022.08.04.502811

License: Apache License 2.0

Dockerfile 0.12% Shell 3.90% Python 94.09% Jupyter Notebook 1.89%

alphafold alphafold-multimer deep-learning protein-structure pytorch

uni-fold's Issues

finetuning unifold

Hi all! I'd like to finetune unifold multimer with new cases. In the git readme file, it mentions that I need a data structure similar to the one in example_data, with features pickle files and labels (please correct me if I'm wrong). However, it is not clear to me how to generate this data structure. Could you provide additional information on that, please?
Thank you!

Why is dim hard-coded in collate_fn in UnifoldMultimerDataset

Hi,

Sorry, I wonder why dim is hard-coded to 1 in the collate_fn here:

Uni-Fold/unifold/dataset.py

Line 383 in e9b3fec

return data_utils.collate_dict(samples, dim=1)

I think it should be adjusted according to the shape of the input matrices? For example, collating 2 or more 'aatype' matrices on axis 1 is fine because it is a 2D matrix and axis 1 corresponds to the number of residues. However, collating 'template_all_atom_positions' on axis 1 won't work because in this case, axis 2 corresponds to the number of residues.

I suppose collate_fn should adjust the dim accordingly but I'm not sure if I understand correctly? I'd really appreciate it if you'd give me some advice. Thanks.
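
A minimal sketch of the per-key idea described above, written on plain nested lists rather than tensors so it runs without PyTorch. `RESIDUE_AXIS`, `pad_along_axis` and `collate_padded` are hypothetical names, and the two axis values simply mirror the examples given in this issue. This is not how `data_utils.collate_dict` is implemented.

```python
import copy

# Hypothetical mapping: which axis holds the residue dimension for each key.
# The two entries mirror the examples given in this issue.
RESIDUE_AXIS = {"aatype": 1, "template_all_atom_positions": 2}


def _length_along(x, axis):
    """Length of nested list `x` along `axis` (assumes a rectangular layout)."""
    for _ in range(axis):
        x = x[0]
    return len(x)


def _filled_like(x, fill):
    """Copy the nesting of `x`, replacing every leaf with `fill`."""
    if isinstance(x, list):
        return [_filled_like(item, fill) for item in x]
    return fill


def pad_along_axis(x, axis, target, fill=0):
    """Pad non-empty nested list `x` with `fill` along `axis` up to `target`."""
    if axis == 0:
        template = _filled_like(x[0], fill)
        extra = [copy.deepcopy(template) for _ in range(target - len(x))]
        return list(x) + extra
    return [pad_along_axis(sub, axis - 1, target, fill) for sub in x]


def collate_padded(samples, residue_axis=RESIDUE_AXIS):
    """Pad every key along its own residue axis, then collect into a batch."""
    batch = {}
    for key in samples[0]:
        axis = residue_axis[key]
        longest = max(_length_along(s[key], axis) for s in samples)
        batch[key] = [pad_along_axis(s[key], axis, longest) for s in samples]
    return batch
```

With a lookup like this, a 2-residue and a 3-residue sample can be batched together because each key is padded on the axis that actually holds residues.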

Using Template PDB files

Hi,

In AlphaFold it is possible to provide a template (or decoy) protein structure and so skip the MSA search, which leads to faster predictions. Is this also possible with Uni-Fold, or is it planned? It would be of great use in some applications.

training error

After I downloaded the training dataset and started training, I got this error:
Traceback (most recent call last):
File "/home/wayne/miniconda3/envs/unifold/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/wayne/miniconda3/envs/unifold/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/wayne/.vscode-server/extensions/ms-python.python-2022.20.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
cli.main()
File "/home/wayne/.vscode-server/extensions/ms-python.python-2022.20.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
run()
File "/home/wayne/.vscode-server/extensions/ms-python.python-2022.20.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
runpy.run_path(target, run_name="__main__")
File "/home/wayne/.vscode-server/extensions/ms-python.python-2022.20.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/wayne/.vscode-server/extensions/ms-python.python-2022.20.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/wayne/.vscode-server/extensions/ms-python.python-2022.20.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
exec(code, run_globals)
File "/home/wayne/project/folding/uni/train.py", line 431, in <module>
cli_main()
File "/home/wayne/project/folding/uni/train.py", line 427, in cli_main
distributed_utils.call_main(args, main)
File "/home/wayne/miniconda3/envs/unifold/lib/python3.9/site-packages/unicore-0.0.1-py3.9-linux-x86_64.egg/unicore/distributed/utils.py", line 193, in call_main
main(args, **kwargs)
File "/home/wayne/project/folding/uni/train.py", line 79, in main
task.load_dataset(valid_sub_split, combine=False, epoch=1)
File "/home/wayne/project/folding/uni/unifold/task.py", line 72, in load_dataset
dataset = data_class(
File "/home/wayne/project/folding/uni/unifold/dataset.py", line 257, in __init__
sample_weight = load_json(
File "/home/wayne/project/folding/uni/unifold/dataset.py", line 255, in load_json
return json.load(open(filename, "r"))
FileNotFoundError: [Errno 2] No such file or directory: '/media/HDD/dataset/protein1/modelscope/hub/datasets/downloads/DPTech/Uni-Fold-Data/master/eval_sample_weight.json'
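
A small pre-flight check can surface this kind of problem before training starts. The sketch below is only an assumption about what to look for: the file names are the ones mentioned in the traceback above and in other issues on this page, and the complete set a given split needs may differ. `missing_dataset_files` is a hypothetical helper, not part of Uni-Fold.

```python
from pathlib import Path

# File names taken from this page (the traceback above and other issues).
# The full set a given split really needs is an assumption.
EXPECTED_FILES = (
    "eval_sample_weight.json",
    "eval_multi_label.json",
    "train_sample_weight.json",
    "train_multi_label.json",
)


def missing_dataset_files(data_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from `data_dir`."""
    root = Path(data_dir)
    return [name for name in expected if not (root / name).is_file()]
```

Calling `missing_dataset_files("/path/to/dataset")` returns an empty list when every listed file is present.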

Rewrite Config

The current config is a single monolithic config that contains every setting, whether or not it is needed.
It would be better to rewrite it as a hierarchical structure with proper inheritance, so that each use case loads only the essential settings.

TODOS:
[x] Rewrite config with chanfig
[ ] Only select essential configs
[ ] Rewrite model_config
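
To illustrate what "hierarchical with inheritance" could mean here, this is a minimal sketch using plain dictionaries. It deliberately does not use chanfig, and `deep_merge`, `BASE` and `MULTIMER` as well as all values are hypothetical, chosen only for illustration.

```python
import copy


def deep_merge(base, override):
    """Return a new dict: `base` with `override` applied recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical values, for illustration only.
BASE = {"model": {"is_multimer": False, "template": {"enabled": True}}}

# A child config states only what differs from its parent.
MULTIMER = deep_merge(BASE, {"model": {"is_multimer": True}})
```

The child inherits `template.enabled` from `BASE` while overriding a single field, which is the property the rewrite is after.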

How to read the downloaded Uni-Fold dataset?

Thank you for providing an excellent library.
I successfully downloaded the Uni-Fold dataset using modelscope and I have a question about it.
After downloading and extracting the dataset, the extracted files have names like "134ffc6f87b3bcf6c8b2a048aeec7dd2ec4b18f4fc015c024e16077f7231fc83".
How can I read these files? The example file in the original Uni-Fold dataset is a pickle file, but the files I downloaded through modelscope look very different.
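
One way to find out what such extension-less files are is to look at their leading "magic" bytes. The sketch below is a generic check, not something taken from Uni-Fold or modelscope; `sniff_format` is a hypothetical helper and the list of formats is far from complete.

```python
# Leading "magic" bytes of a few common container formats.
MAGIC = (
    (b"\x1f\x8b", "gzip"),
    (b"PK\x03\x04", "zip"),
    (b"BZh", "bzip2"),
    (b"\xfd7zXZ\x00", "xz"),
)


def sniff_format(path):
    """Guess a file's container format from its first bytes."""
    with open(path, "rb") as handle:
        head = handle.read(512)
    for magic, name in MAGIC:
        if head.startswith(magic):
            return name
    # A POSIX tar archive carries "ustar" at byte offset 257.
    if head[257:262] == b"ustar":
        return "tar"
    return "unknown"
```

If a file turns out to be a gzip or tar archive, it can be opened with the matching standard-library module (`gzip`, `tarfile`) and inspected further.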

Question about training resources and time

With this code and 8 A100 (80 GB) GPUs, roughly how long would it take to train AlphaFold-Multimer from scratch?

UniFold Symmetry

Creating this issue to track the code merge of Uni-Fold Symmetry.

Readme (news, citation, etc)
Inference Code
Colab example
Training Code

cc @ZiyaoLi , @BaozCWJ

trying installation on a computer with an NVIDIA RTX A4000 16GB VRAM

Hi all,

I am trying an installation of this repo on a computer with an NVIDIA RTX A4000 16GB VRAM.

The base OS is Ubuntu 22.04, and I've successfully installed and run the non-docker version of AlphaFold2 following the instructions in this repo: https://github.com/kalininalab/alphafold_non_docker

I attempted to install Uni-Fold but got stuck at the point where the Python scripts try to load the unicore libraries. I went to the Uni-Core repo and attempted a non-docker installation, but it complained about differing versions of torch. See below:


~/Uni-Core$ pip install .                                                                                                                                                                                                       
Defaulting to user installation because normal site-packages is not writeable                                                                                                                                                                
Processing /home/user/Uni-Core                                                                                                                                                                                                           
  Preparing metadata (setup.py) ... error                                                                                                                                                                                                    
  error: subprocess-exited-with-error                                                                                 
                                                                                                                      
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [22 lines of output]
      Traceback (most recent call last):                                                                                                                                                                                                     
        File "<string>", line 2, in <module>                                                                                                                                                                                                 
        File "<pip-setuptools-caller>", line 34, in <module>                                                                                                                                                                                 
        File "/home/user/Uni-Core/setup.py", line 105, in <module>                                                                                                                                                                       
          check_cuda_torch_binary_vs_bare_metal(torch.utils.cpp_extension.CUDA_HOME)                                                                                                                                                         
        File "/home/user/Uni-Core/setup.py", line 87, in check_cuda_torch_binary_vs_bare_metal                                                                                                                                           
          torch_binary_major = torch.version.cuda.split(".")[0]                                                                                                                                                                              
      AttributeError: 'NoneType' object has no attribute 'split'                                                      
      No CUDA runtime is found, using CUDA_HOME='/usr'                                                                
                                                                                                                      
      Warning: Torch did not find available GPUs on this system.                                                      
       If your intention is to cross-compile, this is not an error.                                                   
      By default, it will cross-compile for Volta (compute capability 7.0), Turing (compute capability 7.5),          
      and, if the CUDA version is >= 11.0, Ampere (compute capability 8.0).                                                                                                                                                                  
      If you wish to cross-compile for a single specific architecture,                                                
      export TORCH_CUDA_ARCH_LIST="compute capability" before running setup.py.                                       
                                                                                                                      
                                                                                                                      
                                                                                                                      
      torch.__version__  = 1.8.0a0                                                                                    
                                                                                                                                                                                                                                             
                                                                                                                                                                                                                                             
      [end of output]                                                                                                 
                                                                                                                      
  note: This error originates from a subprocess, and is likely not a problem with pip.                                
error: metadata-generation-failed                                                                                     
                                                                                                                                                                                                                                             
× Encountered error while generating package metadata.
╰─> See above for output.

The instructions mention the docker route:

Then, you can create and attach into the docker container, and clone & install unifold.

Would it be possible to add more explicit docker instructions on how to achieve the docker installation for both Uni-Fold and Uni-Core?

Given that the GPU I intend to use is an Ampere platform GPU, should I be concerned that the version of torch in the docker containers may not be compatible with my GPU? Thanks in advance.
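
The `AttributeError` in the log above comes from `torch.version.cuda` being `None`, which is what a CPU-only PyTorch build reports. The following sketch shows one way to check this before building Uni-Core. `diagnose_torch_build` and `check_installed_torch` are hypothetical helper names; only `torch.version.cuda` and `torch.cuda.is_available()` are real PyTorch attributes.

```python
def diagnose_torch_build(cuda_version, gpu_visible):
    """Explain a (torch.version.cuda, torch.cuda.is_available()) pair."""
    if cuda_version is None:
        return "CPU-only PyTorch build: torch.version.cuda is None"
    if not gpu_visible:
        return f"CUDA {cuda_version} build, but no GPU is visible to PyTorch"
    return f"CUDA {cuda_version} build with a visible GPU"


def check_installed_torch():
    """Run the diagnosis against the installed PyTorch, if there is one."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    return diagnose_torch_build(torch.version.cuda, torch.cuda.is_available())
```

If the first message is printed, installing a CUDA-enabled PyTorch build into the same environment is the likely prerequisite for the Uni-Core setup script.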

loading state_dict

I ran the command from the "Running Uni-Fold" section (with some paths changed) as follows:
bash run_unifold.sh /home/input_fasta/1.fasta /home/output /path/to/database/directory 2020-05-01 multimer_ft /home/Uni-Fold/all_model/multimer.unifold.pt
The model name is multimer_ft.
The model parameter file is multimer.unifold.pt (from the section 'Downloading the pre-trained model parameters').

But I hit this bug:

Could you help me?

CUDA OOM error when inference long protein sequence

Hi, thanks for sharing the work. I tried to run a long protein sequence (4022 AA) with the provided pretrained monomer model, using model_name "model_2_ft". I tried reducing the chunk size and switching to bf16, but it still failed. My GPU is an A100 40 GB. The error log is below.

{'aatype': torch.Size([1, 1, 4022]), 'residue_index': torch.Size([1, 1, 4022]), 'seq_length': torch.Size([1, 1]), 'template_aatype': torch.Size([1, 1, 4, 4022]), 'template_all_atom_mask': torch.Size([1, 1, 4, 4022, 37]), 'template_all_atom_positions': torch.Size([1, 1, 4, 4022, 37, 3]), 'num_recycling_iters': torch.Size([1, 1]), 'is_distillation': torch.Size([4, 1]), 'seq_mask': torch.Size([1, 1, 4022]), 'msa_mask': torch.Size([4, 1, 508, 4022]), 'msa_row_mask': torch.Size([4, 1, 508]), 'template_mask': torch.Size([1, 1, 4]), 'template_pseudo_beta': torch.Size([1, 1, 4, 4022, 3]), 'template_pseudo_beta_mask': torch.Size([1, 1, 4, 4022]), 'template_torsion_angles_sin_cos': torch.Size([1, 1, 4, 4022, 7, 2]), 'template_alt_torsion_angles_sin_cos': torch.Size([1, 1, 4, 4022, 7, 2]), 'template_torsion_angles_mask': torch.Size([1, 1, 4, 4022, 7]), 'residx_atom14_to_atom37': torch.Size([1, 1, 4022, 14]), 'residx_atom37_to_atom14': torch.Size([1, 1, 4022, 37]), 'atom14_atom_exists': torch.Size([1, 1, 4022, 14]), 'atom37_atom_exists': torch.Size([1, 1, 4022, 37]), 'target_feat': torch.Size([1, 1, 4022, 22]), 'extra_msa': torch.Size([4, 1, 1024, 4022]), 'extra_msa_mask': torch.Size([4, 1, 1024, 4022]), 'extra_msa_row_mask': torch.Size([4, 1, 1024]), 'bert_mask': torch.Size([4, 1, 508, 4022]), 'true_msa': torch.Size([4, 1, 508, 4022]), 'extra_msa_has_deletion': torch.Size([4, 1, 1024, 4022]), 'extra_msa_deletion_value': torch.Size([4, 1, 1024, 4022]), 'msa_feat': torch.Size([4, 1, 508, 4022, 49])}
Traceback (most recent call last):
  File "unifold/inference.py", line 269, in <module>
    main(args)
  File "unifold/inference.py", line 143, in main
    raw_out = model(batch)
  File "/root/paddlejob/workspace/env_run/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/paddlejob/workspace/env_run/Uni-Fold/unifold/modules/alphafold.py", line 444, in forward
    num_ensembles=num_ensembles,
  File "/root/paddlejob/workspace/env_run/Uni-Fold/unifold/modules/alphafold.py", line 365, in iteration_evoformer_structure_module
    feats, m_1_prev, z_prev, x_prev
  File "/root/paddlejob/workspace/env_run/Uni-Fold/unifold/modules/alphafold.py", line 305, in iteration_evoformer
    templ_dim=-4,
  File "/root/paddlejob/workspace/env_run/Uni-Fold/unifold/modules/alphafold.py", line 192, in embed_templates_pair
    t = self.embed_templates_pair_core(batch, z, pair_mask, tri_start_attn_mask, tri_end_attn_mask, templ_dim, multichain_mask_2d)
  File "/root/paddlejob/workspace/env_run/Uni-Fold/unifold/modules/alphafold.py", line 161, in embed_templates_pair_core
    **self.config.template.distogram,
  File "/root/paddlejob/workspace/env_run/Uni-Fold/unifold/modules/featurization.py", line 123, in build_template_pair_feat
    act = torch.cat(to_concat, dim=-1)
RuntimeError: CUDA out of memory. Tried to allocate 21.21 GiB (GPU 0; 39.59 GiB total capacity; 24.58 GiB already allocated; 13.62 GiB free; 24.82 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Thanks
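
A back-of-the-envelope estimate helps explain the size of the failed allocation. The sketch below assumes the template pair feature built in `build_template_pair_feat` has 88 channels (the width used by AlphaFold's template pair features) and is materialised in 32-bit floats. Both are assumptions rather than values read from this log, but together with the 4 templates and 4022 residues shown above they reproduce the reported 21.21 GiB.

```python
GIB = 2 ** 30


def template_pair_feat_gib(num_res, num_templates, channels=88, bytes_per_value=4):
    """Size in GiB of a [templates, N, N, channels] tensor."""
    return num_templates * num_res * num_res * channels * bytes_per_value / GIB


# Shapes from the log above: 4 templates, 4022 residues.
size = template_pair_feat_gib(num_res=4022, num_templates=4)
print(f"{size:.2f} GiB")  # prints "21.21 GiB"
```

Because the size grows with the square of the sequence length, this particular tensor is not reduced by chunking in later layers; fewer templates or a shorter sequence shrinks it directly.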

how to get pdb_labels training dataset

Hello, I can generate the pdb_features dataset with the script in the Uni-Fold-jax repository, but I cannot find any code for generating the label data. I'd appreciate some help.

Flash-Attention

Missing import in Colab

The provided Colab will break at the generate MSAs step, due to the libmsym package missing.

Installing the missing package via pip3 solves the issue for me.

It might confuse someone, though.

could unifold support features.pkl from other method like colabfold?

self-distillation

Would you mind sharing the self-distillation code referred to in the Uni-Fold paper?

BladeDISC estimated inference improvement time?

Hello,
Just curious: your README mentions that BladeDISC models can be used to speed up inference. Do you have any examples of how much of an improvement it gives? I'm trying to see whether it is worth the trouble of installing it.

thanks!

The weight of chain_centre_mass loss

Hi,
I am a bit confused about the weight you set for the chain_centre_mass loss in the final loss. Could you please tell me what value you assign to it during training? (I checked config.py and the default is 0.0.)

dataset down load error

When I download the training dataset, I get this error: requests.exceptions.ConnectionError: HTTPConnectionPool(host='www.modelscope.cn', port=80): Max retries exceeded with url: /api/v1/datasets/DPTech/Uni-Fold-Data/oss/tree/?MaxLimit=-1&Revision=master&Recursive=True&FilterDir=True (Caused by ReadTimeoutError("HTTPConnectionPool(host='www.modelscope.cn', port=80): Read timed out. (read timeout=60)"))
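
For transient timeouts like this one, wrapping the download call in a retry loop with exponential backoff is a common workaround. The sketch below is generic: `retry` is a hypothetical helper, and the download call itself is left to the caller because the modelscope API is not shown in this issue. Note that `requests.exceptions.ConnectionError` is not a subclass of the built-in `ConnectionError`, so it has to be passed explicitly via `retry_on`.

```python
import time


def retry(func, attempts=5, base_delay=1.0,
          retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call `func()` until it succeeds, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
```

A call would look like `retry(download, retry_on=(requests.exceptions.ConnectionError,))`, where `download` is whatever zero-argument function starts the dataset download.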

Question about "{}_cluster_size.json"

Thank you for sharing this nice repository!

When preparing the data for fine-tuning, I have a question about "{}_cluster_size.json".
This file is necessary to generate "{}/{}_train_sample_weight.json" file, but I couldn't find any information about "{}_cluster_size.json".
Could you please explain what it is and how I can create this file?

Can unifold multimer be trained with batch_size higher than 1?

Hi,

Sorry to ask again, but can unifold and unifold_multimer be trained with a batch_size higher than 1? In the example script, per-sample-clip-norm is not set to 0, and when it is non-zero batch_size has to be fixed to 1. When I set per-sample-clip-norm to 0 and increased batch_size above 1, the collating function raised an error when it stacked 'asym_len'. 'asym_len' is a 1×2 vector for a dimer but a 1×4 vector for a tetramer, so torch.stack() fails.

This makes me wonder whether unifold was designed to be trained or finetuned only with batch_size=1.

I look forward to your advice. Thanks a lot

Colab error and how many recycles possible?

Hi, I got this error from attempting to run the colab:

I also wanted to ask how many recycles would be possible to run with two sequences, one 150aa long and the other 250aa long before running out of memory in a Colab (not the pro one).

Thanks

docker run -d -it --gpus all --net=host --name unifold dptechnology/unifold:latest-pytorch1.11.0-cuda11.3 not run

Entries in eval_multi_label.json and eval_sample_weight.json do not exist in pdb_uniprots

Hi,

I've managed to download the full datasets using rclone, but none of the entries listed in eval_multi_label.json and eval_sample_weight.json are in pdb_uniprots/.

For example, T1029.label.pkl.gz and T1029.feature.pkl.gz exist in their corresponding folders, but there is no record of T1029 in the uniprot folder, so the program crashes at the evaluation step.

Errors in train_multi_label.json

Hi,

I wonder how you generated train_multi_label.json, because there seem to be errors in this file. For example, 7a6o has two chains, A and B, in the PDB, but in this file they are labeled 7a6o_AAA and 7a6o_BBB. This and other similar mislabelings have caused errors for me. Could you upload the script that generated this JSON file? Thanks.

Questions regarding Uni-Fold multimer evaluation

Thanks for presenting the great work of Uni-Fold on Protein Structure Prediction. I have a few questions regarding the multimer evaluation part.

For multimer evaluation, the paper says that you merge all chains into one before scoring. Do you predict the multimers with all chains and then remove the TER records from the output PDB files before using the TMscore/lDDT tool? Or do you just use the command $ TMscore -c model.pdb native.pdb with -ter <= 1? Could you share more details or the evaluation scripts, since I haven't found them in the GitHub repo?
Since the order of the chains might be a problem in the predicted multimers, the paper says that you iterate over all possible permutation alignments during evaluation. Could you share the related code so that I can align my experimental results with yours?
I think it's common to use DockQ for multimer evaluation. Is there any reason for not using this metric, other than the potential for confusion mentioned in the paper?
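
For readers who want to experiment before any official script is available, here is a brute-force sketch of the "iterate over all permutations" idea from the second question. It is not the authors' evaluation code. `best_chain_permutation` and the `score` callback are hypothetical, and summing per-chain scores is a simplification that is not equivalent to scoring the merged complex with TMscore.

```python
from itertools import permutations


def best_chain_permutation(pred_chains, native_chains, score):
    """Try every assignment of predicted chains to native chains; keep the best.

    `score(pred, native)` is a caller-supplied similarity (higher is better).
    Only chains with identical sequences should be treated as interchangeable;
    that filtering is left to the caller in this sketch.
    """
    best_order, best_total = None, float("-inf")
    for order in permutations(range(len(pred_chains))):
        # order[j] is the index of the predicted chain matched to native chain j
        total = sum(score(pred_chains[i], native_chains[j])
                    for j, i in enumerate(order))
        if total > best_total:
            best_order, best_total = order, total
    return best_order, best_total
```

The cost grows factorially with the number of chains, so this is only practical for small complexes or after restricting the permutations to groups of identical chains.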

Parameter download links wrong

Hi,

I tried to download the database and parameters fresh and ran into problems with the parameters. The download_all script still downloads the alphafold parameters.
The download instructions in the README are pointing to the wrong files. The normal unifold setup points to the symmetry parameter download and the symmetry instruction points to the normal unifold parameter.

All the best,
Dominik

Failed to load Uni-Fold checkpoints

I ran run_unifold.sh and got an error. How can I debug it?

start to load params /data/data/unifolddata/params/monomer.unifold.pt
Traceback (most recent call last):
File "unifold/inference.py", line 266, in <module>
main(args)
File "unifold/inference.py", line 91, in main
model.load_state_dict(state_dict)
File "/data/miniconda3/envs/unifold/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1604, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for AlphaFold:
Missing key(s) in state_dict: "aux_heads.experimentally_resolved.linear.weight", "aux_heads.experimentally_resolved.linear.bias".
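
When `load_state_dict` reports missing keys, comparing the two key sets directly shows whether the checkpoint and the model were built from different configurations, for example a `model_name` that does not match the checkpoint (a possibility, not a confirmed diagnosis for this report). `diff_state_dict_keys` below is a hypothetical helper that works on any two collections of key names.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, named as load_state_dict names them."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)     # model expects, file lacks
    unexpected = sorted(checkpoint_keys - model_keys)  # file has, model lacks
    return missing, unexpected
```

With PyTorch this could be called as `diff_state_dict_keys(model.state_dict().keys(), state_dict.keys())` just before the failing `model.load_state_dict(state_dict)` line.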

run_uf_symmetry failed

When the run reaches the neural network inference stage, it shows:
"FileNotFoundError: [Errno 2] No such file or directory: '/Uni-Fold/predicted_structure/10GS/chains.txt'"
Did homo_search.py fail to generate some files?

how can I download the training data of multimer only?

I want to train the multimer model from scratch. Because my disk space is limited, I want to download only the multimer training data. Is there a script for this?

Is the pdb data open-sourced?

Hi. Do you open-source the training data (single-chain/ complex protein pdbs)?

questions regarding cuda kernel and inference argument `model_name`

If I want the CUDA kernels installed, do I have to use your docker image? If I don't want the CUDA kernels, can I skip the docker part and directly run pip install -e . (I already have a CUDA-compatible torch environment)?
What does the argument model_name mean? The doc says "specify model name, must be consistent with model parameters", which confuses me. I see many model names and don't know what they represent or how to keep them consistent. I downloaded the Uni-Fold checkpoints from your URL https://uni-fold.dp.tech/unifold_params_2022-08-01.tar.gz. There are two checkpoints, one for monomer and one for multimer. I want to use the monomer one; which model_name does the monomer checkpoint correspond to?

Best,
Zhangzhi

Bug in jupyter notebook cell "Process features for Uni-fold prediction"

The unifold.ipynb jupyter notebook hits an error when it reaches this function:

(
  unpaired_msa,
  paired_msa,
  template_results,
) = get_msa_and_templates(
  target_id,
  unique_sequences,
  result_dir=result_dir,
  msa_mode=msa_mode,
  use_templates=use_templates,
  pair_mode = "unpaired+paired",
  homooligomers_num = homooligomers_num
)

Just a heads up that I was able to fix it by removing the line pair_mode = "unpaired+paired",.
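
Assuming the error is a `TypeError` about an unexpected keyword argument (the report does not show the message), a defensive alternative to editing the call is to drop keyword arguments the installed function does not accept. `call_with_supported_kwargs` is a hypothetical helper built on the standard `inspect` module.

```python
import inspect


def call_with_supported_kwargs(func, *args, **kwargs):
    """Call `func`, dropping keyword arguments its signature does not accept."""
    params = inspect.signature(func).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return func(*args, **kwargs)  # func takes **kwargs, so pass everything
    supported = {k: v for k, v in kwargs.items() if k in params}
    dropped = sorted(set(kwargs) - set(supported))
    if dropped:
        print(f"ignoring unsupported arguments: {dropped}")
    return func(*args, **supported)
```

Silently dropping an argument can hide a real behavior change, so the printed warning is worth keeping.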

evaluation dataset

Is it possible to download your evaluation dataset for Uni-Fold? Will the evaluation code be released? Thank you.

OOM in multimer

Hi,

I'm running into OOM errors in the template module during finetuning (in both training and validation) of Uni-Fold-multimer. It happens with multimer_ft and multimer_af2. All targets have at most 3500 AA (combined). I'm using an A100 80 GB with BF16 enabled. I'm even more confused that it also happens during training, where cropping should limit the peak memory consumption. Is there anything I have to keep in mind? I see that you do chunking during inference; shouldn't this also be done in eval? Was the 1536 AA limit you imposed in the paper tailored to 40 GB of VRAM? Thanks in advance!

The OOM error occurs here: unifold/modules/template.py, line 229, in forward
self.tri_att_start(

More specifically, in attn = torch.matmul(q, k.transpose(-1, -2)) in attentions.py, line 69

RuntimeError: CUDA out of memory. Tried to allocate 75.09 GiB (GPU 0; 79.17 GiB total capacity; 22.66 GiB already allocated; 51.26 GiB free; 26.60 GiB reserved in total by PyTorch)

illegal character 0

This is my output from docker container.

root@c305fc365212:/home/unifold# bash run_unifold.sh /home/unifold/T1052.fasta /home/unifold/output /home/data 2020-05-01 model_2_ft /home/unifold/params/monomer.unifold.pt
Starting homogeneous searching...
I0808 22:00:42.066362 139658661553856 templates.py:945] Using precomputed obsolete pdbs /home/data/pdb_mmcif/obsolete.dat.
I0808 22:00:42.070091 139658661553856 homo_search.py:160] searching homogeneous Sequences & structures for unifold...
I0808 22:00:42.070847 139658661553856 jackhmmer.py:140] Launching subprocess "/usr/bin/jackhmmer -o /dev/null -A /tmp/tmpezbgmjgz/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /home/unifold/T1052.fasta /home/data/uniref90/uniref90.fasta"
I0808 22:00:42.087588 139658661553856 utils.py:36] Started Jackhmmer (uniref90.fasta) query
I0808 22:06:28.933336 139658661553856 utils.py:40] Finished Jackhmmer (uniref90.fasta) query in 346.845 seconds
I0808 22:06:28.936954 139658661553856 jackhmmer.py:140] Launching subprocess "/usr/bin/jackhmmer -o /dev/null -A /tmp/tmp2phimc6z/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /home/unifold/T1052.fasta /home/data/mgnify/mgy_clusters_2018_12.fa"
I0808 22:06:28.955174 139658661553856 utils.py:36] Started Jackhmmer (mgy_clusters_2018_12.fa) query
I0808 22:12:37.104840 139658661553856 utils.py:40] Finished Jackhmmer (mgy_clusters_2018_12.fa) query in 368.149 seconds
I0808 22:12:37.155930 139658661553856 hmmbuild.py:121] Launching subprocess ['/usr/bin/hmmbuild', '--hand', '--amino', '/tmp/tmpvx3aub12/output.hmm', '/tmp/tmpvx3aub12/query.msa']
I0808 22:12:37.172153 139658661553856 utils.py:36] Started hmmbuild query
I0808 22:12:37.564175 139658661553856 hmmbuild.py:129] hmmbuild stdout:

hmmbuild :: profile HMM construction from multiple sequence alignments

HMMER 3.3 (Nov 2019); http://hmmer.org/

Freely distributed under the BSD open source license.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

input alignment file: /tmp/tmpvx3aub12/query.msa

output HMM file: /tmp/tmpvx3aub12/output.hmm

input alignment is asserted as: protein

model architecture construction: hand-specified by RF annotation

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

idx name nseq alen mlen eff_nseq re/pos description

#---- -------------------- ----- ----- ----- -------- ------ -----------
1 query 198 1290 832 5.75 0.590

CPU time: 0.38u 0.00s 00:00:00.38 Elapsed: 00:00:00.39

stderr:

I0808 22:12:37.564431 139658661553856 utils.py:40] Finished hmmbuild query in 0.392 seconds
I0808 22:12:37.566694 139658661553856 hmmsearch.py:117] Launching sub-process ['/usr/bin/hmmsearch', '--noali', '--cpu', '8', '--F1', '0.1', '--F2', '0.1', '--F3', '0.1', '--incE', '100', '-E', '100', '--domE', '100', '--incdomE', '100', '-A', '/tmp/tmpil8uqp24/output.sto', '/tmp/tmpil8uqp24/query.hmm', '/home/data/pdb_seqres/pdb_seqres.txt']
I0808 22:12:37.579384 139658661553856 utils.py:36] Started hmmsearch (pdb_seqres.txt) query
I0808 22:12:49.915640 139658661553856 utils.py:40] Finished hmmsearch (pdb_seqres.txt) query in 12.336 seconds
Traceback (most recent call last):
File "unifold/homo_search.py", line 306, in <module>
app.run(main)
File "/opt/conda/lib/python3.8/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/opt/conda/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "unifold/homo_search.py", line 284, in main
generate_pkl_features(
File "unifold/homo_search.py", line 176, in generate_pkl_features
feature_dict = data_pipeline.process(
File "/home/unifold/unifold/msa/pipeline.py", line 193, in process
pdb_templates_result = self.template_searcher.query(msa_for_templates)
File "/home/unifold/unifold/msa/tools/hmmsearch.py", line 89, in query
return self.query_with_hmm(hmm)
File "/home/unifold/unifold/msa/tools/hmmsearch.py", line 128, in query_with_hmm
raise RuntimeError(
RuntimeError: hmmsearch failed:
stdout:
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3 (Nov 2019); http://hmmer.org/
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  /tmp/tmpil8uqp24/query.hmm
# target sequence database:        /home/data/pdb_seqres/pdb_seqres.txt
# MSA of all hits saved to file:   /tmp/tmpil8uqp24/output.sto
# show alignments in output:       no
# sequence reporting threshold:    E-value <= 100
# domain reporting threshold:      E-value <= 100
# sequence inclusion threshold:    E-value <= 100
# domain inclusion threshold:      E-value <= 100
# MSV filter P threshold:       <= 0.1
# Vit filter P threshold:       <= 0.1
# Fwd filter P threshold:       <= 0.1
# number of worker threads:        8
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       query  [M=832]

stderr:
Parse failed (sequence file /home/data/pdb_seqres/pdb_seqres.txt):
Line 1360234: illegal character 0

Starting prediction...
start to load params /home/unifold/params/monomer.unifold.pt
start to predict unifold
Traceback (most recent call last):
  File "unifold/inference.py", line 232, in <module>
    main(args)
  File "unifold/inference.py", line 103, in main
    batch = load_feature_for_one_target(
  File "unifold/inference.py", line 46, in load_feature_for_one_target
    batch, _ = load_and_process(
  File "/home/unifold/unifold/dataset.py", line 234, in load_and_process
    features, labels = load(**load_kwargs, is_monomer=is_monomer)
  File "/home/unifold/unifold/dataset.py", line 129, in load
    all_chain_features = [
  File "/home/unifold/unifold/dataset.py", line 130, in <listcomp>
    load_single_feature(s, monomer_feature_dir, uniprot_msa_dir, is_monomer)
  File "/home/unifold/unifold/data/utils.py", line 33, in wrapper
    return copy_lib.copy(cached_func(*args, **kwargs))
  File "/home/unifold/unifold/dataset.py", line 72, in load_single_feature
    monomer_feature = utils.load_pickle(
  File "/home/unifold/unifold/data/utils.py", line 33, in wrapper
    return copy_lib.copy(cached_func(*args, **kwargs))
  File "/home/unifold/unifold/data/utils.py", line 67, in load_pickle
    ret = load(path)
  File "/home/unifold/unifold/data/utils.py", line 64, in load
    with open_fn(path, "rb") as f:
  File "/opt/conda/lib/python3.8/gzip.py", line 58, in open
    binary_file = GzipFile(filename, gz_mode, compresslevel)
  File "/opt/conda/lib/python3.8/gzip.py", line 173, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/home/unifold/output/unifold/A.feature.pkl.gz'

convert_unifold_to_alphafold.py ?

Is there a convert_unifold_to_alphafold.py script? 😄

Python library to compare actual and predicted structure

Hi, I was wondering which Python library you used to compare the predicted and actual structures.

something similar to this.

Thanks

HMMSearch Failure reading pdb_seqres.txt for Inference

The error appears to take issue with a specific line in the file, but I downloaded it using the provided scripts from AlphaFold2.

Traceback (most recent call last):
  File "unifold/homo_search.py", line 313, in <module>
    app.run(main)
  File "/opt/conda/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/opt/conda/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "unifold/homo_search.py", line 291, in main
    generate_pkl_features(
  File "unifold/homo_search.py", line 177, in generate_pkl_features
    feature_dict = data_pipeline.process(
  File "/Uni-Fold/unifold/msa/pipeline.py", line 193, in process
    pdb_templates_result = self.template_searcher.query(msa_for_templates)
  File "/Uni-Fold/unifold/msa/tools/hmmsearch.py", line 89, in query
    return self.query_with_hmm(hmm)
  File "/Uni-Fold/unifold/msa/tools/hmmsearch.py", line 128, in query_with_hmm
    raise RuntimeError(
RuntimeError: hmmsearch failed:
stdout:
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3 (Nov 2019); http://hmmer.org/
# Copyright (C) 2019 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  /tmp/tmprvyby0zh/query.hmm
# target sequence database:        /database/pdb_seqres/pdb_seqres.txt
# MSA of all hits saved to file:   /tmp/tmprvyby0zh/output.sto
# show alignments in output:       no
# sequence reporting threshold:    E-value <= 100
# domain reporting threshold:      E-value <= 100
# sequence inclusion threshold:    E-value <= 100
# domain inclusion threshold:      E-value <= 100
# MSV filter P threshold:       <= 0.1
# Vit filter P threshold:       <= 0.1
# Fwd filter P threshold:       <= 0.1
# number of worker threads:        8
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       query  [M=117]


stderr:
Parse failed (sequence file /database/pdb_seqres/pdb_seqres.txt):
Line 1360658: illegal character 0



Starting prediction...
start to load params /database/monomer.unifold.pt
start to predict T1104
Traceback (most recent call last):
  File "unifold/inference.py", line 266, in <module>
    main(args)
  File "unifold/inference.py", line 118, in main
    batch = load_feature_for_one_target(
  File "unifold/inference.py", line 61, in load_feature_for_one_target
    batch, _ = load_and_process(
  File "/Uni-Fold/unifold/dataset.py", line 233, in load_and_process
    features, labels = load(**load_kwargs, is_monomer=is_monomer)
  File "/Uni-Fold/unifold/dataset.py", line 129, in load
    all_chain_features = [
  File "/Uni-Fold/unifold/dataset.py", line 130, in <listcomp>
    load_single_feature(s, monomer_feature_dir, uniprot_msa_dir, is_monomer)
  File "/Uni-Fold/unifold/data/utils.py", line 33, in wrapper
    return copy_lib.copy(cached_func(*args, **kwargs))
  File "/Uni-Fold/unifold/dataset.py", line 72, in load_single_feature
    monomer_feature = utils.load_pickle(
  File "/Uni-Fold/unifold/data/utils.py", line 33, in wrapper
    return copy_lib.copy(cached_func(*args, **kwargs))
  File "/Uni-Fold/unifold/data/utils.py", line 67, in load_pickle
    ret = load(path)
  File "/Uni-Fold/unifold/data/utils.py", line 64, in load
    with open_fn(path, "rb") as f:
  File "/opt/conda/lib/python3.8/gzip.py", line 58, in open
    binary_file = GzipFile(filename, gz_mode, compresslevel)
  File "/opt/conda/lib/python3.8/gzip.py", line 173, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/data/output/T1104/A.feature.pkl.gz'
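
The `illegal character 0` message suggests that the reported line of `pdb_seqres.txt` contains a NUL byte, which is often a sign of a truncated or corrupted download. A minimal sketch for checking that reading is shown below; `find_nul_lines` and `scan_file` are hypothetical helper names, not part of Uni-Fold or HMMER. The second `FileNotFoundError` is presumably a consequence of the first failure, since the feature file is never written when the search step aborts.

```python
from typing import Iterable, List


def find_nul_lines(lines: Iterable[bytes]) -> List[int]:
    """Return the 1-based numbers of the lines that contain a NUL byte."""
    return [i for i, line in enumerate(lines, start=1) if b"\x00" in line]


def scan_file(path: str) -> List[int]:
    """Scan a file on disk for lines that contain NUL bytes."""
    # Binary mode keeps NUL bytes intact instead of decoding them.
    with open(path, "rb") as handle:
        return find_nul_lines(handle)
```

Calling `scan_file("/database/pdb_seqres/pdb_seqres.txt")` lists the affected lines. If any are reported, downloading the file again is probably the simplest remedy.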

Problems with Unifold-Symmetry Colab

This is something that's popped up in the last few days. I've tried to run the latest version of the Unifold Colab (that also implements UF-Symmetry), and it starts off running well (after I paste in my sequence and set the right symmetry_group), installs the packages, and then crashes with the following error message(s) below. Thanks very much in advance for your consideration of this error. Terrific program (particularly the symmetry option!) and the Colab has been quite useful.

**2022-11-28 17:04:47 (4.55 MB/s) - ‘unicore-0.0.1+cu113torch1.12.1-cp37-cp37m-linux_x86_64.whl’ saved [9428354/9428354]

Cloning into 'Uni-Fold'...
DEPRECATION: A future pip version will change local packages to be built in-place without first copying to a temporary directory. We recommend you use --use-feature=in-tree-build to test your packages with this new behavior before it becomes the default.
pip 21.3 will remove support for this functionality. You can find discussion regarding this at pypa/pip#7555.

Cannot retrieve the public link of the file. You may need to change
the permission to 'Anyone with the link', or have had many accesses.

You may still be able to access the file from the browser:

 https://drive.google.com/uc?id=1A9iXMYCwP0f_U0FgISJ_6BX7FXZtglvV

tar (child): unifold_params_2022-08-01.tar.gz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now

Cannot retrieve the public link of the file. You may need to change
the permission to 'Anyone with the link', or have had many accesses.

You may still be able to access the file from the browser:

 https://drive.google.com/uc?id=1UNEGzmueQTxY05QIRweKHxOjr1ht-G_Q

tar (child): uf_symmetry_params_2022-09-06.tar.gz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now**

And in the UniFold Prediction on GPU section, another error/fail message:

**start to load params ./uf_symmetry.pt

FileNotFoundError Traceback (most recent call last)
in
21 times=times,
22 manual_seed=manual_seed,
---> 23 device="cuda:0", # do not change this on colab.
24 )

3 frames
/usr/local/lib/python3.7/dist-packages/torch/serialization.py in __init__(self, name, mode)
209 class _open_file(_opener):
210 def __init__(self, name, mode):
--> 211 super(_open_file, self).__init__(open(name, mode))
212
213 def __exit__(self, *args):

FileNotFoundError: [Errno 2] No such file or directory: './uf_symmetry.pt'**

URL for pre-trained Uni-Fold model parameters not working

The URL to download the Uni-Fold model parameters in the README instructions does not work:

wget https://github.com/dptech-corp/Uni-Fold/releases/download/v2.2.0/unifold_params_2022-08-01.tar.gz

model_5_multimer_v2 config

Hi,

I am trying to get model_5_multimer_v2 working, changing the multimer_af2 config accordingly. I thought it just requires disabling templating:

    c.model.template.enabled = False
    c.model.template.embed_angles = False
    recursive_set(c, "use_templates", False)
    recursive_set(c, "use_template_torsion_angles", False)

I run into the following issue:

alphafold/alphafold_iteration/evoformer/pair_activiations//weights is not a file in the archive

Any idea how to resolve it? I thought the multimer configs are identical except for templating.

Thanks in advance!

How to validate the model during training

Inferencing using multiple GPUs

Hi, is there any way to use multiple GPUs for inference?
I have 4 GPUs and would like to use all of them for inference with the multimer model.

Thanks in advance
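
One generic approach, independent of whatever Uni-Fold supports internally, is to split the list of targets across GPUs and run one inference process per GPU, each restricted to a single device through `CUDA_VISIBLE_DEVICES`. The sketch below shows only the bookkeeping; `shard_targets` and `env_for_gpu` are hypothetical helpers, and the inference command itself is left out. Note that this parallelizes across targets and does not split a single multimer prediction over several GPUs.

```python
import os
from typing import Dict, List, Sequence


def shard_targets(targets: Sequence[str], num_gpus: int) -> List[List[str]]:
    """Round-robin split of target names into one list per GPU."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [list(targets[i::num_gpus]) for i in range(num_gpus)]


def env_for_gpu(gpu_index: int) -> Dict[str, str]:
    """Copy the current environment and expose only one GPU to a child process."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env
```

Each shard can then be handed to a separate process, for example with `subprocess.Popen(cmd, env=env_for_gpu(i))`, where `cmd` is whatever single-GPU inference command is already in use.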

Training Dataset

Is "A larger dataset" the training dataset? If so, when will the data be released? The training data is essential for reproducing the model.

pdb_assembly.json does not agree with train_multi_label.json

Hi,

There are some entries in pdb_assembly.json that contain chains which are not listed in train_multi_label.json, so the programme reports a key-not-found error. For example, in pdb_assembly.json, 7l89 has: {'symbol': 'C1', 'stoi': ['A3', 'B3', 'C2'], 'chains': ['F', 'D', 'C', 'E', 'B', 'A', 'H', 'L'], 'opers': ['I', 'I', 'I', 'I', 'I', 'I', 'I', 'I']}
but in the train_multi_label.json dictionary, only 7l8d_B and 7l87_C have chains A, B, C, D, E, and F from 7l89 in their values. There are no records for 7l89 H or L in train_multi_label.json.

I've added some extra checks to dataset.py myself and the programme now works, but I suppose it shouldn't be like this. I believe either pdb_assembly.json or train_multi_label.json is incorrect?

Cheers
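
A check of the kind described above could look like the following sketch. It rests on assumptions drawn only from this report: that pdb_assembly.json maps a PDB id to a dict with a `chains` list, and that train_multi_label.json maps `<pdb>_<chain>` keys to lists of `<pdb>_<chain>` strings. The function names are hypothetical and the real file layout may differ.

```python
from typing import Dict, Iterable, List, Set


def labelled_chain_ids(multi_label: Dict[str, Iterable[str]]) -> Set[str]:
    """Collect every "<pdb>_<chain>" id that appears as a key or inside a value."""
    ids = set(multi_label)
    for members in multi_label.values():
        ids.update(members)
    return ids


def missing_chains(
    assembly: Dict[str, dict], multi_label: Dict[str, Iterable[str]]
) -> Dict[str, List[str]]:
    """Map each PDB id to the assembly chains that have no label entry."""
    known = labelled_chain_ids(multi_label)
    missing: Dict[str, List[str]] = {}
    for pdb_id, entry in assembly.items():
        # Assumed layout: entry["chains"] is a list of chain names.
        absent = [c for c in entry["chains"] if f"{pdb_id}_{c}" not in known]
        if absent:
            missing[pdb_id] = absent
    return missing
```

Running `missing_chains` over both loaded JSON files before training would list every affected assembly at once, rather than surfacing them one key error at a time.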

convert alphafold checkpoints failed with empty output

I get an error when using convert_alphafold_to_unifold.py:

incorrect keys: []
missing keys: []

How can I check it?

Demo Case : RuntimeError & ChildFailedError

The following error occurred when I ran the demo case command:
bash train_monomer_demo.sh .

2022-12-23 10:48:24 | INFO | unicore_cli.train | task: AlphafoldTask
2022-12-23 10:48:24 | INFO | unicore_cli.train | model: AlphafoldModel
2022-12-23 10:48:24 | INFO | unicore_cli.train | loss: AlphafoldLoss
2022-12-23 10:48:24 | INFO | unicore_cli.train | num. model params: 94,169,845 (num. trained: 94,169,845)
2022-12-23 10:48:24 | INFO | unicore.utils | ***********************CUDA enviroments for all 1 workers***********************
2022-12-23 10:48:24 | INFO | unicore.utils | rank   0: capabilities =  7.5  ; total memory = 14.756 GB ; name = Tesla T4
2022-12-23 10:48:24 | INFO | unicore.utils | ***********************CUDA enviroments for all 1 workers***********************
2022-12-23 10:48:24 | INFO | unicore_cli.train | training on 1 devices (GPUs)
2022-12-23 10:48:24 | INFO | unicore_cli.train | batch size per device = 1
2022-12-23 10:48:24 | INFO | unicore.trainer | Preparing to load checkpoint ./checkpoint_last.pt
2022-12-23 10:48:24 | INFO | unicore.trainer | No existing checkpoint found ./checkpoint_last.pt
2022-12-23 10:48:24 | INFO | unicore.trainer | loading train data for epoch 1
2022-12-23 10:48:24 | INFO | unifold.dataset | load 2 chains (unique 1 sequences)
2022-12-23 10:48:24 | INFO | unifold.dataset | load 1 self-distillation samples.
2022-12-23 10:48:24 | INFO | unicore.tasks.unicore_task | get EpochBatchIterator for epoch 1
2022-12-23 10:48:25 | INFO | unicore.optim.adam | using FusedAdam
2022-12-23 10:48:25 | INFO | unicore.trainer | begin training epoch 1
2022-12-23 10:48:25 | INFO | unicore_cli.train | Start iterating over samples

Traceback (most recent call last):
  File "/home/unifold13/.conda/envs/unifold/bin/unicore-train", line 8, in <module>
    sys.exit(cli_main())
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/site-packages/unicore_cli/train.py", line 403, in cli_main
    distributed_utils.call_main(args, main)
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/site-packages/unicore/distributed/utils.py", line 190, in call_main
    distributed_main(args.device_id, main, args, kwargs)
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/site-packages/unicore/distributed/utils.py", line 164, in distributed_main
    main(args, **kwargs)
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/site-packages/unicore_cli/train.py", line 126, in main
    valid_losses, should_stop = train(args, trainer, task, epoch_itr, ckp_copy_thread)
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/contextlib.py", line 75, in inner
    return func(*args, **kwds)
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/site-packages/unicore_cli/train.py", line 216, in train
    log_output = trainer.train_step(samples)
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/contextlib.py", line 75, in inner
    return func(*args, **kwds)
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/site-packages/unicore/trainer.py", line 649, in train_step
    raise e
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/site-packages/unicore/trainer.py", line 613, in train_step
    loss, sample_size_i, logging_output = self.task.train_step(
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/site-packages/unicore/tasks/unicore_task.py", line 279, in train_step
    loss, sample_size, logging_output = loss(model, sample)
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data/unifold13/applications/uniFold/Uni-Fold/unifold/loss.py", line 41, in forward
    out, config = model(batch)
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data/unifold13/applications/uniFold/Uni-Fold/unifold/model.py", line 47, in forward
    outputs = self.model.forward(batch)
  File "/data/unifold13/applications/uniFold/Uni-Fold/unifold/modules/alphafold.py", line 437, in forward
    ) = self.iteration_evoformer_structure_module(
  File "/data/unifold13/applications/uniFold/Uni-Fold/unifold/modules/alphafold.py", line 364, in iteration_evoformer_structure_module
    m, z0, s0, msa_mask, m_1_prev_emb, z_prev_emb = self.iteration_evoformer(
  File "/data/unifold13/applications/uniFold/Uni-Fold/unifold/modules/alphafold.py", line 299, in iteration_evoformer
    self.embed_templates_pair(
  File "/data/unifold13/applications/uniFold/Uni-Fold/unifold/modules/alphafold.py", line 192, in embed_templates_pair
    t = self.embed_templates_pair_core(batch, z, pair_mask, tri_start_attn_mask, tri_end_attn_mask, templ_dim, multichain_mask_2d)
  File "/data/unifold13/applications/uniFold/Uni-Fold/unifold/modules/alphafold.py", line 163, in embed_templates_pair_core
    single_templates = [
  File "/data/unifold13/applications/uniFold/Uni-Fold/unifold/modules/alphafold.py", line 164, in <listcomp>
    self.template_pair_embedder(x, z)
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data/unifold13/applications/uniFold/Uni-Fold/unifold/modules/embedders.py", line 257, in forward
    x = self.linear(x.type(self.linear.weight.dtype))
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 103, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: at::cuda::blas::gemm: not implemented for N3c108BFloat16E
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 66840) of binary: /home/unifold13/.conda/envs/unifold/bin/python
Traceback (most recent call last):
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/site-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/site-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/site-packages/torch/distributed/run.py", line 715, in run
    elastic_launch(
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/unifold13/.conda/envs/unifold/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
/home/unifold13/.conda/envs/unifold/bin/unicore-train FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>

------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2022-12-23_10:48:29
  host      : unfold-1
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 66840)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
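
The log above reports a Tesla T4 with compute capability 7.5, while the failing call is a BFloat16 matrix multiply. Native bfloat16 support on NVIDIA GPUs starts with compute capability 8.0 (Ampere), so one plausible reading is that the demo's bf16 setting does not suit this GPU and PyTorch build. The sketch below picks a precision from the capability tuple; `pick_precision` is a hypothetical helper, and how the choice is passed to the training script depends on the demo script, which is not shown here.

```python
from typing import Tuple


def pick_precision(capability: Tuple[int, int]) -> str:
    """Choose a reduced-precision dtype name from a CUDA compute capability."""
    if capability >= (8, 0):
        return "bf16"  # Ampere and newer have native bfloat16 support
    if capability >= (7, 0):
        return "fp16"  # Volta and Turing tensor cores support float16
    return "fp32"  # fall back to full precision on older GPUs
```

With PyTorch installed, the tuple can be obtained from `torch.cuda.get_device_capability(0)`.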

How did you generate eval_sample_weight.json, train_sample_weight.json etc.

Hi,

I wonder how you decided on the weights assigned to these samples during training and evaluation. By the way, could you please upload all of these JSON files here? I cannot download any of them from modelscope, either via the web interface or with the download scripts, because of a time-out error. I'd really appreciate it if these JSON files could be uploaded here.

Thanks

Predict from A3M directly ?

Hi!

Is it possible to use an MSA directly for the prediction?
I have already configured ColabFold, which takes up a lot of space on my hard drive, and I would like to avoid creating another database (if possible :))

Thanks!
Best regards,
Thibault Tubiana.
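
For context on what reusing an existing A3M involves: the format is FASTA-like, and lowercase letters mark insertions relative to the query, so dropping them yields rows aligned to the query length. The sketch below reads such a file into memory; `parse_a3m` is a hypothetical helper, and whether Uni-Fold's feature pipeline can consume the result directly is not stated in this thread.

```python
from typing import List, Optional, Tuple


def parse_a3m(text: str) -> List[Tuple[str, str]]:
    """Parse A3M text into (description, aligned_sequence) pairs.

    Lowercase letters are insertions relative to the query and are dropped,
    so each returned sequence is aligned to the query columns.
    """
    records: List[Tuple[str, str]] = []
    header: Optional[str] = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            # Keep uppercase residues and gap characters only.
            chunks.append("".join(ch for ch in line if not ch.islower()))
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

The first record is conventionally the query itself, which makes it easy to confirm that the alignment matches the sequence being predicted.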


