haddocking / cport


CPORT is a Consensus Prediction Of interface Residues in Transient complexes used to predict protein-protein interface residues.

License: Apache License 2.0

Python 79.31% PureBasic 20.69%
bioinformatics computational-biology deep-learning deep-neural-networks machine-learning meta-predictor predictor protein-protein-interaction research-software utrecht-university

cport's People

Contributors

aldovdn, amjjbonvin, apalazis, brianjimenez, dependabot[bot], palazis, rvhonorato

cport's Issues

ERROR in ISPRED

I installed cport and encountered this error. How can I fix it?

cport example/1PPE.pdb A
2023-10-19 13:33:08.396271: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-10-19 13:33:08.529716: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-10-19 13:33:08.530431: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-10-19 13:33:10.259470: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
[2023-10-19 13:33:11,839 cli:L142 INFO] ------------------------------------------
[2023-10-19 13:33:11,839 cli:L143 INFO] Welcome to CPORT v0.2.0-alpha
[2023-10-19 13:33:11,839 cli:L144 INFO] ------------------------------------------
[2023-10-19 13:33:11,840 loader:L357 INFO] Running method: scriber
[2023-10-19 13:33:11,840 scriber:L210 INFO] Running SCRIBER
[2023-10-19 13:33:11,841 loader:L357 INFO] Running method: sppider
[2023-10-19 13:33:11,841 scriber:L211 INFO] Will try 36 times waiting 30s between tries
[2023-10-19 13:33:11,841 loader:L357 INFO] Running method: scannet
[2023-10-19 13:33:11,841 sppider:L190 INFO] Running SPPIDER
[2023-10-19 13:33:11,842 scannet:L176 INFO] Running ScanNet
[2023-10-19 13:33:11,842 loader:L357 INFO] Running method: ispred4
[2023-10-19 13:33:11,842 sppider:L191 INFO] Will try 36 times waiting 45s between tries
[2023-10-19 13:33:11,842 scannet:L177 INFO] Will try 36 times waiting 30s between tries
[2023-10-19 13:33:11,843 ispred4:L198 INFO] Running ISPRED4
[2023-10-19 13:33:11,844 ispred4:L199 INFO] Will try 36 times waiting 60s between tries
/home/ankush/.local/lib/python3.10/site-packages/Bio/PDB/StructureBuilder.py:87: PDBConstructionWarning: WARNING: Chain E is discontinuous at line 2361.
warnings.warn("WARNING: Chain %s is discontinuous at line %i."
/home/ankush/.local/lib/python3.10/site-packages/Bio/PDB/StructureBuilder.py:87: PDBConstructionWarning: WARNING: Chain I is discontinuous at line 2487.
warnings.warn("WARNING: Chain %s is discontinuous at line %i."
/home/ankush/.local/lib/python3.10/site-packages/Bio/SeqIO/PdbIO.py:217: BiopythonWarning: Ignoring out-of-order residues after a gap
warnings.warn("Ignoring out-of-order residues after a gap",
Exception in thread scriber:
Traceback (most recent call last):
File "/home/ankush/miniconda3/envs/pgml/lib/python3.10/threading.py", line 1009, in _bootstrap_inner
self.run()
File "/home/ankush/cport/src/cport/modules/threadreturn.py", line 14, in run
self._return = self._target(self._args, **self._kwargs)
File "/home/ankush/cport/src/cport/modules/loader.py", line 359, in run_prediction
result = predictor_func()
File "/home/ankush/cport/src/cport/modules/loader.py", line 85, in run_scriber
predictions = scriber.run()
File "/home/ankush/cport/src/cport/modules/scriber.py", line 213, in run
submitted_url = self.submit()
File "/home/ankush/cport/src/cport/modules/scriber.py", line 53, in submit
fasta_string = get_fasta_from_pdbfile(
File "/home/ankush/cport/src/cport/modules/utils.py", line 138, in get_fasta_from_pdbfile
return sequence
UnboundLocalError: local variable 'sequence' referenced before assignment
[2023-10-19 13:33:14,709 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 36
Exception in thread scannet:
Traceback (most recent call last):
File "/home/ankush/miniconda3/envs/pgml/lib/python3.10/threading.py", line 1009, in _bootstrap_inner
self.run()
File "/home/ankush/cport/src/cport/modules/threadreturn.py", line 14, in run
self._return = self._target(self._args, **self._kwargs)
File "/home/ankush/cport/src/cport/modules/loader.py", line 359, in run_prediction
result = predictor_func()
File "/home/ankush/cport/src/cport/modules/loader.py", line 268, in run_scannet
predictions = scannet.run()
File "/home/ankush/cport/src/cport/modules/scannet.py", line 179, in run
submitted_url = self.submit()
File "/home/ankush/cport/src/cport/modules/scannet.py", line 64, in submit
browser.follow_link(browser.links()[7])
IndexError: list index out of range
[2023-10-19 13:34:15,419 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 35
[2023-10-19 13:34:19,375 sppider:L106 DEBUG] Waiting for SPPIDER to finish... 36
[2023-10-19 13:35:16,100 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 34
[2023-10-19 13:35:37,263 sppider:L106 DEBUG] Waiting for SPPIDER to finish... 35
[2023-10-19 13:36:16,808 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 33
[2023-10-19 13:36:54,362 sppider:L106 DEBUG] Waiting for SPPIDER to finish... 34
[2023-10-19 13:37:17,532 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 32
[2023-10-19 13:37:40,849 loader:L109 INFO] {'active': [38, 39, 57, 61, 62, 74, 75, 92, 93, 96, 114, 117, 119, 120, 121, 122, 125, 127, 132, 133, 135, 137, 151, 159, 202, 203, 236, 239, 240], 'passive': []}
[2023-10-19 13:38:18,356 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 31
[2023-10-19 13:39:19,044 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 30
[2023-10-19 13:40:19,747 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 29
[2023-10-19 13:41:20,422 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 28
[2023-10-19 13:42:21,113 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 27
[2023-10-19 13:43:21,928 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 26
[2023-10-19 13:44:22,622 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 25
[2023-10-19 13:45:23,383 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 24
[2023-10-19 13:46:24,079 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 23
[2023-10-19 13:47:24,772 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 22
[2023-10-19 13:48:25,595 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 21
[2023-10-19 13:49:26,287 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 20
[2023-10-19 13:50:26,958 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 19
[2023-10-19 13:51:27,637 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 18
[2023-10-19 13:52:28,314 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 17
[2023-10-19 13:53:29,134 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 16
[2023-10-19 13:54:29,815 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 15
[2023-10-19 13:55:30,506 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 14
[2023-10-19 13:56:31,183 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 13
[2023-10-19 13:57:31,889 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 12
[2023-10-19 13:58:32,708 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 11
[2023-10-19 13:59:33,412 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 10
[2023-10-19 14:00:34,114 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 9
[2023-10-19 14:01:34,812 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 8
[2023-10-19 14:02:35,484 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 7
[2023-10-19 14:03:36,310 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 6
[2023-10-19 14:04:37,003 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 5
[2023-10-19 14:05:37,693 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 4
[2023-10-19 14:06:38,383 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 3
[2023-10-19 14:07:39,068 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 2
[2023-10-19 14:08:39,893 ispred4:L118 DEBUG] Waiting for ISPRED4 to finish... 1
[2023-10-19 14:09:40,582 ispred4:L125 ERROR] ISPRED4 server is not responding, url was https://ispred4.biocomp.unibo.it/ispred/default/job_summary?jobid=7ed2a43a-b55b-4fc0-8097-addc024355fc
/home/ankush/.local/lib/python3.10/site-packages/Bio/PDB/StructureBuilder.py:87: PDBConstructionWarning: WARNING: Chain E is discontinuous at line 2361.
warnings.warn("WARNING: Chain %s is discontinuous at line %i."
/home/ankush/.local/lib/python3.10/site-packages/Bio/PDB/StructureBuilder.py:87: PDBConstructionWarning: WARNING: Chain I is discontinuous at line 2487.
warnings.warn("WARNING: Chain %s is discontinuous at line %i."
Traceback (most recent call last):
File "/home/ankush/miniconda3/envs/pgml/bin/cport", line 33, in <module>
sys.exit(load_entry_point('cport', 'console_scripts', 'cport')())
File "/home/ankush/cport/src/cport/cli.py", line 117, in maincli
cli(argument_parser, main)
File "/home/ankush/cport/src/cport/cli.py", line 112, in cli
main_func(**vars(cmd))
File "/home/ankush/cport/src/cport/cli.py", line 192, in main
format_output(
File "/home/ankush/cport/src/cport/modules/utils.py", line 153, in format_output
standardized_dic = standardize_residues(result_dic, chain_id, pdb_file)
File "/home/ankush/cport/src/cport/modules/utils.py", line 259, in standardize_residues
reslist = get_residue_list(pdb_file, chain_id)
File "/home/ankush/cport/src/cport/modules/utils.py", line 337, in get_residue_list
chain = model[chain_id]
File "/home/ankush/.local/lib/python3.10/site-packages/Bio/PDB/Entity.py", line 40, in __getitem__
return self.child_dict[id]
KeyError: 'A'
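
The Biopython warnings in this log mention only chains E and I, and the final KeyError is raised when chain 'A' is looked up, so one plausible reading is that the requested chain is simply not present in example/1PPE.pdb. A minimal sketch for checking which chain IDs a PDB file contains before running cport follows. It relies only on the fixed-column PDB layout (chain ID in column 22 of coordinate records); the helper name list_chain_ids is hypothetical and not part of cport, and the two records are made up for illustration.

```python
def list_chain_ids(pdb_lines):
    """Return the chain IDs found in ATOM/HETATM records, in order of first appearance."""
    chains = []
    for line in pdb_lines:
        # The chain identifier sits in column 22 (index 21) of coordinate records.
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chain_id = line[21]
            if chain_id not in chains:
                chains.append(chain_id)
    return chains


# Two made-up coordinate records, for illustration only.
example = [
    "ATOM      1  N   ILE E  16      -7.000  10.000  20.000  1.00 10.00           N",
    "ATOM      2  N   ARG I   1       1.000   2.000   3.000  1.00 10.00           N",
]
print(list_chain_ids(example))
```

The function accepts any iterable of lines, so an open file handle can be passed directly. If the chain given on the command line is not in the returned list, that would be consistent with the KeyError above.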

Use poetry as dependency manager?

We already have quite a few complicated dependencies. We should both minimize them and check whether poetry is a good option for dependency management.

Review non-python files

Because of licensing issues, not all non-Python files should be present in this repository.

Add output file

Currently we simply print all the residues to stdout; we should write them to disk in CSV format.
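
A minimal sketch of what writing the residues to CSV could look like, assuming the result is a dictionary with 'active' and 'passive' lists as printed by the loader. The function name and the two-column layout are assumptions for illustration, not the format cport actually uses.

```python
import csv
import io


def write_predictions_csv(predictions, handle):
    """Write one row per residue: the residue number and its label."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["residue", "prediction"])
    for label in ("active", "passive"):
        for residue in sorted(predictions.get(label, [])):
            writer.writerow([residue, label])


# An in-memory buffer stands in for a file opened with open(path, "w", newline="").
buffer = io.StringIO()
write_predictions_csv({"active": [39, 38], "passive": [40]}, buffer)
print(buffer.getvalue())
```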

Add a "predictor loader" logic

Each predictor behaves very differently and thus needs its own code. We should reuse functions as much as possible and design a robust predictor loader.
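
One possible shape for such a loader is a small registry that maps predictor names to functions and isolates failures, sketched below. Every name here (PREDICTORS, register, run_predictors, the dummy predictors) is hypothetical, and this is not how cport's loader is implemented.

```python
PREDICTORS = {}


def register(name):
    """Decorator that records a predictor function under a name."""
    def decorator(func):
        PREDICTORS[name] = func
        return func
    return decorator


@register("dummy_a")
def run_dummy_a(pdb_file, chain_id):
    # Placeholder: a real predictor would submit the structure and parse results.
    return {"active": [1, 2], "passive": []}


@register("dummy_b")
def run_dummy_b(pdb_file, chain_id):
    return {"active": [], "passive": [3]}


def run_predictors(names, pdb_file, chain_id):
    """Run the selected predictors, collecting results and failures separately."""
    results, errors = {}, {}
    for name in names:
        try:
            results[name] = PREDICTORS[name](pdb_file, chain_id)
        except Exception as exc:  # one failing predictor should not stop the rest
            errors[name] = repr(exc)
    return results, errors
```

With this layout, adding a predictor means writing one decorated function, and a failure in one predictor is reported without aborting the others.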

Update install instructions

Now that some predictors need chromedriver, we should update the install instructions accordingly, else users might get an error such as:

 [2022-05-25 15:44:59,962 cli:L138 ERROR] Message: 'chromedriver' executable needs to be in PATH. Please see https://chromedriver.chromium.org/home
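
Until the instructions are updated, a pre-flight check along these lines could give users a clearer message. This is only a sketch using the standard library's shutil.which; the function names are hypothetical.

```python
import shutil


def find_executable(name):
    """Return the full path of an executable on PATH, or None if it is absent."""
    return shutil.which(name)


def check_chromedriver():
    """Describe whether chromedriver can be found on PATH."""
    path = find_executable("chromedriver")
    if path is None:
        return "chromedriver not found on PATH; install it before using predictors that need it"
    return f"chromedriver found at {path}"


print(check_chromedriver())
```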

Update modules to account for MechanicalSoup update

MechanicalSoup was updated to v1.3.0 with #72, which mitigates CVE-2023-34457.

This update passed the tests; however, the usage of MechanicalSoup was not covered by them 🫠

Now, running the example results in:

Exception has occurred: ValueError
From v1.3.0 onwards, you must pass an open file object directly, e.g. `form["name"] = open("/path/to/file", "rb")`. This change is to remediate a security vulnerability where a malicious web server could read arbitrary files from the client (CVE-2023-34457).
  File "/home/rodrigo/repos/cport/src/cport/modules/ispred4.py", line 55, in submit
    input_form.set(name="structure", value=self.pdb_file)
  File "/home/rodrigo/repos/cport/src/cport/modules/ispred4.py", line 197, in run
    submitted_url = self.submit()
  File "/home/rodrigo/repos/cport/src/cport/modules/loader.py", line 62, in run_ispred4
    predictions = ispred4.run()
  File "/home/rodrigo/repos/cport/src/cport/modules/loader.py", line 359, in run_prediction
    result = predictor_func()
  File "/home/rodrigo/repos/cport/src/cport/modules/threadreturn.py", line 14, in run
    self._return = self._target(self._args, **self._kwargs)
ValueError: From v1.3.0 onwards, you must pass an open file object directly, e.g. `form["name"] = open("/path/to/file", "rb")`. This change is to remediate a security vulnerability where a malicious web server could read arbitrary files from the client (CVE-2023-34457).
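
Following the error message, the fix would be to pass an open file object instead of a path string. The sketch below illustrates the pattern with a stand-in form class so it runs without MechanicalSoup or network access; FakeForm, submit_structure and the submit callable are hypothetical, and only the set(name=..., value=...) call mirrors the line in the traceback.

```python
import os
import tempfile


class FakeForm:
    """Stand-in for the real form object, so the sketch runs offline."""

    def __init__(self):
        self.fields = {}

    def set(self, name, value):
        self.fields[name] = value


def submit_structure(form, pdb_path, submit):
    """Attach the PDB as an open binary file and submit while it is still open."""
    with open(pdb_path, "rb") as handle:
        form.set(name="structure", value=handle)  # file object, not a path string
        return submit()


# Demonstration with a temporary file and a fake submit step that reads the upload.
with tempfile.NamedTemporaryFile("wb", suffix=".pdb", delete=False) as tmp:
    tmp.write(b"END\n")
form = FakeForm()
result = submit_structure(form, tmp.name, lambda: form.fields["structure"].read())
os.unlink(tmp.name)
print(result)
```

Keeping the submission inside the with block matters because the file must still be open when the form is sent.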

Code review: server variable

You could use the server variable to select the function and refactor the code to avoid repeating the line:

p = Process(target=func1, args=(i, predictors_dic, params, main_dir))
p = Process(target=func2, args=(i, predictors_dic, params, main_dir))
...

cport/cport.py

Line 40 in 4022764

if server == "promate":
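
A sketch of the suggested refactor, using a dictionary to map each server name to its function so the Process line is written once. The function names are hypothetical, and only "promate" and "whiscy" are server names that appear in these issues.

```python
from multiprocessing import Process


def run_whiscy(i, predictors_dic, params, main_dir):
    """Placeholder: the real function would query the WHISCY server."""


def run_promate(i, predictors_dic, params, main_dir):
    """Placeholder: the real function would query the PRO-MATE server."""


# Map each server name to the function that handles it.
SERVER_FUNCS = {"whiscy": run_whiscy, "promate": run_promate}


def build_processes(servers, predictors_dic, params, main_dir):
    """Create one (unstarted) Process per known server name."""
    processes = []
    for i, server in enumerate(servers):
        func = SERVER_FUNCS.get(server)
        if func is None:
            raise ValueError(f"Unknown server: {server}")
        processes.append(
            Process(target=func, args=(i, predictors_dic, params, main_dir), name=server)
        )
    return processes
```

The caller would then start and join the returned processes as before; supporting a new server only requires a new dictionary entry rather than another if branch.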

Incorrect residue number parsing for `predictprotein`

When reading the results from the predictor, the file is loaded into a pandas dataframe and the following logic follows:

        for row in final_predictions.itertuples():
            if row.Protein_Pred == 1:  # 1 indicates interaction
                prediction_dict["active"].append(row[0])
            elif row.Protein_Pred == 0:  # 0 indicates no interaction
                prediction_dict["passive"].append(row[0])
            else:
                log.warning(
                    f"There appears that residue {row} is either empty or unprocessable"
                )

However, row[0] is the 0-based DataFrame index and does not correspond to the predictor's residue numbering:

> final_predictions.head()
  Residue_Number  Protein_Pred
0          Res_1             0
1          Res_2             0
2          Res_3             0
3          Res_4             0
4          Res_5             0

[5 rows x 2 columns]
> row[0]
0
> row
Pandas(Index=0, Residue_Number='Res_1', Protein_Pred=0)
> row[1]
'Res_1'

We should therefore parse row[1] to get the correct residue number.
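
A minimal sketch of that parsing, assuming every label has the form 'Res_<number>' as in the output above. A namedtuple stands in for the rows returned by itertuples() so the example runs without pandas; residue_number is a hypothetical helper name.

```python
from collections import namedtuple

# Mimics the rows produced by DataFrame.itertuples() in the example above.
Row = namedtuple("Row", ["Index", "Residue_Number", "Protein_Pred"])


def residue_number(label):
    """Convert a label such as 'Res_12' into the integer 12."""
    prefix, _, number = label.partition("_")
    if prefix != "Res" or not number.isdigit():
        raise ValueError(f"Unexpected residue label: {label!r}")
    return int(number)


rows = [Row(0, "Res_1", 0), Row(1, "Res_2", 1)]
prediction_dict = {"active": [], "passive": []}
for row in rows:
    key = "active" if row.Protein_Pred == 1 else "passive"
    # row[1] is the Residue_Number column; row[0] is only the positional index.
    prediction_dict[key].append(residue_number(row[1]))

print(prediction_dict)
```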

Remove '\n'

'\n' is not cross-platform and os.linesep should be used instead:

cport/cport.py

Line 139 in 4022764

print("Update the threshold values based on the successful predictors\n")

Please replace all instances of '\n'.

Add option to select servers to poll

It would be nice to add an option to the command line to specify the meta-servers to poll, for example:

./cport.py --servers wp

This would only poll WHISCY and PRO-MATE servers, but not the rest.
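
Such an option could be sketched with argparse as below. Only the letters "w" and "p" come from the example above; the lookup table, function name and default behaviour (poll every known server) are assumptions.

```python
import argparse

# Only "w" and "p" come from the example above; a real table would cover every server.
SERVER_CODES = {"w": "whiscy", "p": "promate"}


def parse_servers(argv):
    """Translate a string of one-letter codes into a list of server names."""
    parser = argparse.ArgumentParser(prog="cport.py")
    parser.add_argument(
        "--servers",
        default="".join(SERVER_CODES),
        help="one letter per server to poll, e.g. 'wp'",
    )
    args = parser.parse_args(argv)
    unknown = [c for c in args.servers if c not in SERVER_CODES]
    if unknown:
        parser.error(f"unknown server code(s): {''.join(unknown)}")
    return [SERVER_CODES[c] for c in args.servers]


print(parse_servers(["--servers", "wp"]))
```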

muscle3 binary

muscle3 binary should not be distributed or included in the cport repository. As an external dependency, it could be specified via env variables or a configuration file.
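
A sketch of the environment-variable approach, falling back to a PATH lookup. The variable name MUSCLE3_BIN and the executable name "muscle" are assumptions chosen for illustration.

```python
import os
import shutil


def locate_muscle(env=None, env_var="MUSCLE3_BIN"):
    """Return the path to muscle from an environment variable, else from PATH."""
    env = os.environ if env is None else env
    configured = env.get(env_var)
    if configured:
        return configured
    found = shutil.which("muscle")
    if found is None:
        raise FileNotFoundError(
            f"muscle not found; set {env_var} or add the binary to PATH"
        )
    return found
```

A user would then point cport at a local installation with something like `export MUSCLE3_BIN=/path/to/muscle` before running it.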
