fteufel / signalp-6.0
Multi-class signal peptide prediction and structure decoding model.
Home Page: https://services.healthtech.dtu.dk/service.php?SignalP-6.0
License: Other
Hi there,
I'm running this software as part of a workflow that eventually generates a trinotate.xls.
The issue I'm facing is time: my script has been running for a bit over 4 days, and the maximum allowed is 7 days.
Is there a way to run this in sections so it can keep going?
Or, if I submit the script again, will it remember where it left off and continue?
Kind regards,
George
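One way to run the job in sections, as asked above, is to split the input FASTA into chunks and submit each chunk as its own job, then concatenate the results. A sketch using only the Python standard library (file names and chunk size are illustrative):

```python
def split_fasta(path, n_per_chunk, prefix="chunk"):
    """Split a FASTA file into chunks of n_per_chunk records each.

    Returns the list of chunk file names written.
    """
    chunk, count, part = [], 0, 0
    out_files = []

    def flush():
        nonlocal chunk, part
        if chunk:
            name = f"{prefix}_{part:03d}.fasta"
            with open(name, "w") as fh:
                fh.writelines(chunk)
            out_files.append(name)
            part += 1
            chunk = []

    with open(path) as fh:
        for line in fh:
            if line.startswith(">"):
                if count and count % n_per_chunk == 0:
                    flush()
                count += 1
            chunk.append(line)
    flush()
    return out_files
```

Each chunk can then be run as a separate (resubmittable) job, so a wall-time kill only loses the chunk in flight, not the whole run.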
Dear author,
Your prediction server constantly returns this message when requested for analysis:
Exception: WebfaceSystemError
Package: Webface::service : 712
Message: Unable to create /var/www/webface/tmp/server/SignalP-6.0/659D1C91000040FA921FC790 : No space left on device
Kindly look into it.
Regards.
Is the code for the restricted single linkage clustering of the dataset available? Thank you very much.
Hi, I’m currently working with SignalP. I have some questions and I would very much appreciate it if you can please help me with them.
There’s a parameter named “--bsize”, which according to the instructions: “--bsize, -bs is the integer batch size used for prediction. When running on GPU, this should be adjusted to maximize usage of the available memory. On CPU, the choice usually has only a limited effect on performance. Defaults to 10.”
What would be the best way to figure out the value that maximizes usage of the available memory?
We are running SignalP on CPU, but we have seen significant improvements when increasing the bsize value. We just need to figure out how to choose the ideal value based on dataset size, number of CPUs, etc.
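There is no closed-form formula for the ideal value, but it can be found empirically: run the same input at a few candidate batch sizes and measure throughput. A minimal timing harness (the fake_predict cost model below is purely illustrative; in practice run_batch would wrap a real signalp6 invocation on a fixed test set):

```python
import time

def benchmark_batch_sizes(run_batch, total_items, batch_sizes):
    """Time a per-batch prediction callable at several batch sizes.

    run_batch(batch_size) processes one batch; returns items/second per size.
    """
    results = {}
    for bs in batch_sizes:
        n_batches = max(1, total_items // bs)
        start = time.perf_counter()
        for _ in range(n_batches):
            run_batch(bs)
        elapsed = time.perf_counter() - start
        results[bs] = (n_batches * bs) / elapsed
    return results

# Dummy stand-in: a fixed per-batch overhead plus a per-item cost,
# which is roughly how CPU inference behaves.
def fake_predict(batch_size):
    time.sleep(0.001 + 0.0001 * batch_size)

for bs, rate in benchmark_batch_sizes(fake_predict, 200, [10, 25, 50]).items():
    print(f"bsize={bs}: {rate:.0f} items/s")
```

The sweet spot is where throughput stops improving; beyond that, larger batches only increase memory use.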
Hi,
I'm trying to install signalp6 in a Singularity/Apptainer container so I can run it on an HPC cluster. The build fails with a hash-mismatch error; the complete installation routine and full log follow.
The OS is ubuntu:22.04.
I used the non-dev version from https://services.healthtech.dtu.dk/services/SignalP-6.0/
The complete installation routine is:
apt-get update -y
apt-get upgrade -y
apt-get install --no-install-recommends -y curl bash vim git ca-certificates tar unzip gzip wget gcc build-essential cpanminus
apt-get install --no-install-recommends -y sqlite3 ncbi-blast+ hmmer infernal infernal-doc
apt-get install --no-install-recommends -y python3 python3-pip
cd /usr/local/src
tar -xvzf signalp-6.0g.fast.tar.gz && cd signalp6_fast
pip3 install signalp-6-package/
Processing ./signalp-6-package
Installing build dependencies: started
Installing build dependencies: finished with status 'done'
Getting requirements to build wheel: started
Getting requirements to build wheel: finished with status 'done'
Preparing metadata (pyproject.toml): started
Preparing metadata (pyproject.toml): finished with status 'done'
Collecting torch>1.7.0
Downloading torch-2.0.1-cp310-cp310-manylinux1_x86_64.whl (619.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 619.9/619.9 MB 831.6 kB/s eta 0:00:00
Collecting matplotlib>3.3.2
Downloading matplotlib-3.7.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.6/11.6 MB 23.1 MB/s eta 0:00:00
Collecting tqdm>4.46.1
Downloading tqdm-4.65.0-py3-none-any.whl (77 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 77.1/77.1 KB 13.8 MB/s eta 0:00:00
Collecting numpy>1.19.2
Downloading numpy-1.24.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 17.3/17.3 MB 21.1 MB/s eta 0:00:00
Collecting pillow>=6.2.0
Downloading Pillow-9.5.0-cp310-cp310-manylinux_2_28_x86_64.whl (3.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.4/3.4 MB 23.6 MB/s eta 0:00:00
Collecting pyparsing>=2.3.1
Downloading pyparsing-3.0.9-py3-none-any.whl (98 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 98.3/98.3 KB 19.3 MB/s eta 0:00:00
Collecting fonttools>=4.22.0
Downloading fonttools-4.39.4-py3-none-any.whl (1.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.0/1.0 MB 22.8 MB/s eta 0:00:00
Collecting contourpy>=1.0.1
Downloading contourpy-1.0.7-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (300 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 300.3/300.3 KB 19.1 MB/s eta 0:00:00
Collecting cycler>=0.10
Downloading cycler-0.11.0-py3-none-any.whl (6.4 kB)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib>3.3.2->signalp6==6.0+g) (23.1)
Collecting python-dateutil>=2.7
Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 247.7/247.7 KB 7.9 MB/s eta 0:00:00
Collecting kiwisolver>=1.0.1
Downloading kiwisolver-1.4.4-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 18.2 MB/s eta 0:00:00
Collecting nvidia-nvtx-cu11==11.7.91
Downloading nvidia_nvtx_cu11-11.7.91-py3-none-manylinux1_x86_64.whl (98 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 98.6/98.6 KB 16.9 MB/s eta 0:00:00
Collecting nvidia-cuda-runtime-cu11==11.7.99
Downloading nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 849.3/849.3 KB 16.0 MB/s eta 0:00:00
Collecting nvidia-cuda-cupti-cu11==11.7.101
Downloading nvidia_cuda_cupti_cu11-11.7.101-py3-none-manylinux1_x86_64.whl (11.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.8/11.8 MB 16.2 MB/s eta 0:00:00
Collecting sympy
Downloading sympy-1.12-py3-none-any.whl (5.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.7/5.7 MB 24.9 MB/s eta 0:00:00
Collecting nvidia-cublas-cu11==11.10.3.66
Downloading nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 317.1/317.1 MB 2.8 MB/s eta 0:00:00
Collecting nvidia-curand-cu11==10.2.10.91
Downloading nvidia_curand_cu11-10.2.10.91-py3-none-manylinux1_x86_64.whl (54.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 54.6/54.6 MB 13.1 MB/s eta 0:00:00
Collecting nvidia-cusolver-cu11==11.4.0.1
Downloading nvidia_cusolver_cu11-11.4.0.1-2-py3-none-manylinux1_x86_64.whl (102.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 102.6/102.6 MB 11.1 MB/s eta 0:00:00
Collecting networkx
Downloading networkx-3.1-py3-none-any.whl (2.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 21.1 MB/s eta 0:00:00
Collecting triton==2.0.0
Downloading triton-2.0.0-1-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (63.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.3/63.3 MB 14.7 MB/s eta 0:00:00
Collecting nvidia-cusparse-cu11==11.7.4.91
Downloading nvidia_cusparse_cu11-11.7.4.91-py3-none-manylinux1_x86_64.whl (173.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸ 153.8/173.2 MB 19.6 MB/s eta 0:00:01
ERROR: THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
nvidia-cusparse-cu11==11.7.4.91 from https://files.pythonhosted.org/packages/ea/6f/6d032cc1bb7db88a989ddce3f4968419a7edeafda362847f42f614b1f845/nvidia_cusparse_cu11-11.7.4.91-py3-none-manylinux1_x86_64.whl#sha256=a3389de714db63321aa11fbec3919271f415ef19fda58aed7f2ede488c32733d (from torch>1.7.0->signalp6==6.0+g):
Expected sha256 a3389de714db63321aa11fbec3919271f415ef19fda58aed7f2ede488c32733d
Got b253327205118db8b9d6ef5e7257f310d1024e115b8a4f6e21f1bc5b02b9a598
FATAL: While performing build: while running engine: exit status 1
It would be great if you could check what might be wrong with the hash.
Thank you
Best,
Nadine
Please advise.
Hello, I got the following warning messages when running signalp6 with "--organism euk":
Unknown behaviour encountered for sequence no. 12101. Please check outputs.
Unknown behaviour encountered for sequence no. 16029. Please check outputs.
Unknown behaviour encountered for sequence no. 25368. Please check outputs.
Are these sequence numbers referring to the original file? If so, these are the sequences:
>BSUD.16781.1.p1
MISLLTTFFLLLSPKVAGDCYGDTIARQKRLLGEMLDMSPAILSMEKLHADSVQQTLHEM
EIFQKYQFAEITPYEYKKLLLSRYLVFSASFALNICNQYGARLVEIMEEDERKAIAVMLD
LSDTPVDCLVIGTRFDKGDWTYWYSTRPAYRTLSTEENQQKGNCMILENQKSWNMTRVPC
LRTYFCHFMCEISMK*
>BSUD.18903.3.p1
MKPDHHAKMAAQMKERLKVEELAENIDELEDVVENAFPVLLVTVLVSIFLLAIFLVRMYL
RYTVENPSKNRMDGKTVLITGATSGLGKATAIELARKNARVLITGRDKIKVEAVARNIRK
KTGNQHVNALVLDLANLRGIREFCEAFCKDEKYLHVLINNAAYMGPKAATDDNLERCFGV
NYLGHFYLTYLLSDKLKKNAPSRVINVVSDSYAIGQLDFDDIALNKGYDVFKAYARSKYA
MMLWNLEHHRRTYSSCIWTFAVHPGACATELLRNYPGLTGNLLRIVSRIMFKAPEDGCQT
IVYLAVADGLREFSGKTFANCKVIKTQDRIKDKEVAKELWNISAHLCGFEPDTPYEEQES
TEAKETTTSDSPTADIAAAAAVSEQKKDK*
>BSUD.2410.1.p1
MKNRPSAAFRASAKPPTYCKMESQKEDEEDDGKGSRTMVLAGGGDGSNTEGAVPAKGGCG
EGRVLIFLVLGVLTLVFSGVLIGIYMNIRTLTSSLDVIEVMPSFVPAAAGGLAGLFLLGL
FWKRCVVLVYPVLVLCAVSTGLSIIIAVLTGTHVLQPLLSVSGCVYTRKGNICQCLTQFK
RDKLDLERVNAGETVYLALHNVSSCEDVQTVIPTMLYTMIGIYGLLALVSAVAGIISFLV
YRTERNRNYLDDTDYDEDEDSSPSTPSSNTDNYTEHQNMLSSRQANVTTAASVGNIYTNT
NATTNDEETNNDDGNTTPSDLTYNPSDAPMGYTEACKMRRCQSFTQPHKGAAREGSPGSS
ESGQTASSLMSSDRVADGGAIRLKENRKKGRRAVTLHGLDRDQLLLILSLQMRYLQESEQ
LAKKECQSALNLNNINKPNRTNVKNPNATSSDTSQENIDTSNTFSHFQRRAMTPTPRQER
LSSSRATNDDLDYKPAKQVRSHTPQPYHFKVQHNGVPATMGPVLLPNIPAQYQVEQQLQP
LQVQHLQQQQLQQQQQFLQFQQQQQLAQLQQQQQKIQMQQQLQLQQHLQVQQMSQHLPMQ
QPILVQQPALQSANHNMENESLTMITYDLRSVQTTGPIVYENVPSSRTSLNYYSPDGSCS
SNSSSLLRATNLPGYPSLSSNSVPTQTQPMPQPVESSPIKVLGGNIQTTTPNSPGQISRQ
TSEISSPSHSGNDSINTEQKQPGKSPESTPETQPAKAKGKKKLSKKEKNAKKEEEKTATT
DSTKTSTLGRTDSNASKAGSVSDRWQAVLPDGKAQAQTLWENVQRKIVSDPQTPDSTLTF
PHSSHPTSQTSVQPSAPNHFPNSSLVVPNGILKKTKSVPYSQQSSPNLAPVSPQSLPQFD
SQVPSNIYSSYNQMTNTPSYNAFIPIPTSSTDNYEDIDDFSRANQHAQQVQQDVPPPKPA
RLHARKPAPGAEAGDTLDSQGQQLRPKSYLSAVDRESMAAASLTSMGNVPCQPAYEGTNK
QSLTHMQMEGPVRDERSGGIVPAGGNDQLIMIYGDLYAQPRRKSIPTNLPLLPSLNQSHQ
MYQNSDIDHNLMDNQARAQYRLDVHQPQRSAFHMLGHTYHGDRSGHLPSNEICDTDLDEL
PIPRWNSRYHRSQSFSPPPYTPPPVYQSLESVGKYPSIRSTSSSSSDPHNSSSEGSTLDN
IQPGFTNNRLQGPPLSMSRGRQPAHVTDRRFCLTQQRQPPPNLYNHSGRRLNLKSDYDSF
RDRRVDQEPPPQVAIRRCQSVEEGNRKRLTSGQHLVNGKMVPHNNYYPNIDNIRTTSDGQ
NLENNLKMGPSVNRVRPIHNGSVPNTEQPAFQKGVQNFQNEAYIPNKGCQARKFPGDVSE
EDLSCSIDTDSVISDSSSQEVCPNKELNGFITHGRALESSDSDKDDYAETVI*
What is the cause? Thanks
Hi there!
Thanks again for your great work.
It would be really helpful if the distributed packages carried their current version numbers!
I maintain a pipeline that runs SignalP6 (https://github.com/ccdmb/predector), and I've been getting bug reports from users.
It would be helpful to know exactly which version of SignalP 6 is actually installed, so that I know whether to report a new bug to you or ask them to update.
It also lets me cache pre-computed results and automatically skip repeated analyses when an identical version has been run before.
Currently the -h / --help flag lists: usage: SignalP 6.0 Signal peptide prediction tool [-h] --fastafile FASTAFILE, and setup.py lists version='1.0'.
It would be helpful if you could tag the full version info, e.g. 6.0e, in these places. Alternatively, a dedicated --version flag is becoming more common.
I use bump2version to do this automatically alongside a git tag.
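For what it's worth, argparse has a built-in action for this. A sketch of how the entry point could expose it (the "6.0e" string is illustrative; ideally the real value would be sourced from the package metadata in one place):

```python
import argparse

__version__ = "6.0e"  # illustrative; ideally read from package metadata

parser = argparse.ArgumentParser(
    description=f"SignalP {__version__} signal peptide prediction tool"
)
# action="version" prints the version string and exits immediately.
parser.add_argument(
    "--version", action="version", version=f"signalp6 {__version__}"
)
```

Running `signalp6 --version` would then print "signalp6 6.0e" and exit, which is easy for wrapping pipelines to parse.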
I know this is pretty low priority so feel absolutely free to say no.
It would just be helpful.
Thanks and all the best,
Darcy
Hi,
The web version does not detect a signal peptide, but the Linux version (--mode slow-sequential) does.
signalp6 -fasta /path/to/input.fasta -org euk --output_dir path/to_be_saved --format txt --mode slow-sequential
output:
ID Prediction OTHER SP(Sec/SPI) CS Position
test1 SP 0.317355 0.68265 CS pos: 59-60. Pr: 0.7032
input.fasta
>test1
MVITHLPAKIWRTLVIIRCVGLSESLKIFSGEALCVHHTLPKHIFLLLLLVLLADLVPGQPVGGRTCPKINTRKEWRQLS
RESQASYLKAVKCLTTKPTTLRTRFRLRHYDDFQYVHSTLYMQGTDIWYSSTSKVSVNVVIRMVFRTGIGA
What is the threshold for selection?
Thanks
Hello, how do you pass in the model and parameters when converting this model to an ONNX model?
My code is as follows:
model = torch.load("best_weight.pth", map_location="cpu")
dummy_input = (1, 3, 224, 224)
torch.onnx.export(model, dummy_input, f, verbose=False, input_names=input_names, output_names=output_names)
Is this right?
Hi, I found that SignalP 6.0 was trained to identify signal peptides of the Sec or Tat pathways. However, when I tested SignalP 6.0 on the secreted protein DSBA_ECOLI, which is reported to be SRP-dependent, I got the following output:
It surprised me that the signal sequence was identified quite well, consistent with the reported one, yet the signal peptide of DSBA_ECOLI is also classified into the Sec pathway. I find this phenomenon interesting. Could you give some possible explanations?
Hi community,
SignalP 6.0 is a wonderful tool, but I can't access the website (https://services.healthtech.dtu.dk/service.php?SignalP-6.0) for the online service. Is something wrong here?
Thanks,
Nemo
Hi fteufel,
SignalP 6.0 is an outstanding piece of software for signal peptide prediction. I ran it with the parameter --torch_num_thread 25, but it raised an error:
Traceback (most recent call last):
File "/data/conda_envs/signalp6/bin/signalp6", line 33, in <module>
sys.exit(load_entry_point('signalp6==6.0+h', 'console_scripts', 'signalp6')())
File "/data/conda_envs/signalp6/lib/python3.7/site-packages/signalp6-6.0+h-py3.7.egg/signalp/__init__.py", line 6, in predict
main()
File "/data/conda_envs/signalp6/lib/python3.7/site-packages/signalp6-6.0+h-py3.7.egg/signalp/predict.py", line 144, in main
torch.set_num_threads(args.torch_num_threads)
RuntimeError: set_num_threads expects an int, but got str
Then I checked line 144 in predict.py:
torch.set_num_threads(args.torch_num_threads)
and found that the error was raised because the parameter was passed as a string, so I wrapped it with the int function:
torch.set_num_threads(int(args.torch_num_threads))
and it worked. I wonder if you could fix this in a later version of SignalP 6.0. Thanks again for the useful tool!
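An equivalent fix on the argument-parsing side would be to declare the option with type=int, so the conversion happens once at parse time instead of at every use site (a sketch; the actual flag definition in predict.py may differ):

```python
import argparse

parser = argparse.ArgumentParser()
# With type=int, argparse converts the string "25" itself, so
# torch.set_num_threads(args.torch_num_threads) receives an int directly.
parser.add_argument("--torch_num_threads", type=int, default=8)

args = parser.parse_args(["--torch_num_threads", "25"])
print(args.torch_num_threads)  # → 25 (as an int)
```

This also makes argparse reject non-numeric values with a clear error message instead of failing later inside torch.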
Best,
Zewei
Hi there!
Thanks for your great work.
I was testing out the update and came across an issue while running on the SignalP 5 benchmark set, during the marginal conflict resolution step.
This sequence:
>A0R1E8|POSITIVE|LIPO|2
MTQNCVAPVAIIGMACRLPGAINSPQQLWEALLRGDDFVTEIPTGRWDAEEYYDPEPGVPGRSVSKWGAF
from https://services.healthtech.dtu.dk/services/SignalP-6.0/public_data/benchmark_set_sp5.fasta
appears to be the issue.
Running version 6.0b in "fast" mode with this sequence, in both the other and eukaryote organism settings, causes the following error.
$ signalp6 --output_dir test --format txt --organism euk --mode fast --fastafile test.fasta
/home/ubuntu/miniconda3/envs/sp6/lib/python3.6/site-packages/torch/nn/modules/module.py:1051: UserWarning: where received a uint8 condition tensor. This behavior is deprecated and will be removed in a future version of PyTorch. Use a boolean condition instead. (Triggered internally at /tmp/pip-req-build-1ky46svp/aten/src/ATen/native/TensorCompare.cpp:255.)
return forward_call(*input, **kwargs)
Predicting: 100%|| 1/1 [00:00<00:00, 1.53batch/s]
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/sp6/bin/signalp6", line 8, in <module>
sys.exit(predict())
File "/home/ubuntu/miniconda3/envs/sp6/lib/python3.6/site-packages/signalp/__init__.py", line 6, in predict
main()
File "/home/ubuntu/miniconda3/envs/sp6/lib/python3.6/site-packages/signalp/predict.py", line 235, in main
resolve_viterbi_marginal_conflicts(global_probs, marginal_probs, cleavage_sites, viterbi_paths)
File "/home/ubuntu/miniconda3/envs/sp6/lib/python3.6/site-packages/signalp/utils.py", line 254, in resolve_viterbi_marginal_conflicts
cleavage_sites[i] = sp_idx.max() +1
File "/home/ubuntu/miniconda3/envs/sp6/lib/python3.6/site-packages/numpy/core/_methods.py", line 39, in _amax
return umr_maximum(a, axis, None, out, keepdims, initial, where)
ValueError: zero-size array to reduction operation maximum which has no identity
This doesn't appear to be an issue with the previous version available for download.
Both have identical main dependency versions:
python 3.6.13
numpy 1.19.5
pytorch 1.9.1
tqdm 4.62.3
Thanks in advance,
Darcy
Hello, when I used SignalP 6.0 to predict signal peptides, I found negative values for cleavage sites in the output.gff3 file. What is the reason?
Example:
"RS_3_bin.18_00081 hypothetical protein SignalP-6.0 signal_peptide 1 -1 0.56286603 . . Note=TAT"
Looking forward to your reply.
David
The slow mode does not seem to work.
My command is:
signalp6 --fastafile /home/barthelemy/beta/A.fasta --organism other --output_dir /home/barthelemy/beta/signalP_out --format txt --mode slow
Does slow-sequential produce the same output as slow?
best
barthelemy
Hi,
SignalP6 is out and it looks great. I started to work on it but getting following error.
Installed as follow:
python3 -m pip install signalp-6-package/
SIGNALP_DIR=$(python3 -c "import signalp; import os; print(os.path.dirname(signalp.__file__))")
cp -r signalp-6-package/models/* $SIGNALP_DIR/model_weights/
signalp6 -fasta /path/to/input.fasta -org euk --output_dir path/to/be/saved --format txt --mode slow
Traceback (most recent call last):
File "/home/.local/lib/python3.6/site-packages/signalp/predict.py", line 207, in main
model = torch.jit.load(SLOW_MODEL_PATH)
File "/home/.local/lib/python3.6/site-packages/torch/jit/_serialization.py", line 151, in load
raise ValueError("The provided filename {} does not exist".format(f)) # type: ignore[str-bytes-safe]
ValueError: The provided filename /home/.local/lib/python3.6/site-packages/signalp/model_weights/ensemble_model_signalp6.pt does not exist
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/.local/bin/signalp6", line 8, in <module>
sys.exit(predict())
File "/home/.local/lib/python3.6/site-packages/signalp/__init__.py", line 6, in predict
main()
File "/home/.local/lib/python3.6/site-packages/signalp/predict.py", line 211, in main
raise FileNotFoundError(f"Slow mode requires the full model to be installed at {SLOW_MODEL_PATH}. Missing from this installation or incorrectly configured.")
FileNotFoundError: Slow mode requires the full model to be installed at /home/.local/lib/python3.6/site-packages/signalp/model_weights/ensemble_model_signalp6.pt. Missing from this installation or incorrectly configured.
I could not find the model "ensemble_model_signalp6.pt"; where can I download it from?
The model present is "sequential_models_signalp6".
Thanks
The package is not installing via the command "pip install signalp-6-package/". When I installed it via "python setup.py install", no "signalp-6-package/models/" directory was found, and no model files were present anywhere in the directory. Consequently, the package does not run.
It used to work when I installed it on my system a few months ago: the package was installed via "pip install signalp-6-package/", the "signalp-6-package/models/" directory was created, I copied the model file "distilled_model_signalp6.pt" to the "signalp/model_weights/" directory, and the package started running. I am now facing errors in those same steps!
I would appreciate any help you could provide, as I need this Python package. Awaiting your kind response.
Hello,
I am using SignalP6, but I am having a hard time retrieving the proteins that are supposed to carry the signal peptide from the files named "processed_entries". As I understood from previous versions of the program, these files were supposed to contain only the proteins predicted to have the signal peptide, with their signal peptide sequences removed. Instead I get a FASTA (albeit without the ">" in the header names) containing all the proteins I provided as input, even though the "prediction_results.txt" file clearly shows that not all of the input proteins carry a signal. That file looks like this:
FRA0004_1 NO_SP 1.000040 0.000000
FRA0004_2 NO_SP 1.000039 0.000004
FRA0004_3 SP 0.000306 0.999679 CS pos: 18-19. Pr: 0.9627
FRA0004_4 NO_SP 1.000064 0.000000
.
.
.
But the "fasta" file looks like this:
FRA0004_1
MKDIVSAISHRFHGLSKSKAAV...
FRA0004_2
MEGLRRSLIEGSSSEKYSVYNP...
FRA0004_3
MKTSSVLALASFAAFATASPIAP...
FRA0004_4
MKFGTTLRKSVYAPWKDKYID...
FRA0004_5
MDTPPRVFETAVGKFWR ...
.
.
.
I really appreciate any help that you can provide to solve this,
Thank you so much
I installed SignalP as follows:
FROM nvidia/cuda:12.1.1-base-ubuntu20.04
MAINTAINER Thomas Roder
ADD software/signalp-6.0g.fast.tar.gz /opt/signalp
# install system dependencies
RUN apt update
RUN apt install -y python3.9 python3-distutils curl zip unzip less
RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.9
# install python dependencies
RUN pip install -r /opt/signalp/signalp6_fast/signalp-6-package/requirements.txt
RUN pip install /opt/signalp/signalp6_fast/signalp-6-package
RUN mv /opt/signalp/signalp6_fast/signalp-6-package/models/* /usr/local/lib/python3.9/dist-packages/signalp/model_weights/
# replace python 3.8 with 3.9
RUN \
update-alternatives --install /usr/bin/python python /usr/bin/python3.8 1 && \
update-alternatives --install /usr/bin/python python /usr/bin/python3.9 2 && \
update-alternatives --set python /usr/bin/python3.9 && \
update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.8 1 && \
update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.9 2 && \
update-alternatives --set python3 /usr/bin/python3.9
# convert to GPU if GPU_MODE
ARG GPU_MODE
RUN if [ -n "$GPU_MODE" ]; then \
echo "GPU_MODE is ON: converting models..."; \
signalp6_convert_models gpu; \
else \
echo "GPU_MODE is OFF: do nothing"; \
fi
WORKDIR /data
Build with podman/CUDA.
podman build . --tag signalp_cpu
podman build --build-arg GPU_MODE=TRUE --security-opt=label=disable --hooks-dir=/usr/share/containers/oci/hooks.d/ . --tag signalp_gpu
podman run --rm \
-v ./:/data:Z \
signalp_cpu \
signalp6 --fastafile /data/test/predicted.faa --organism other --output_dir /data/test/out_cpu --format txt --mode fast
podman run --rm \
-v ./:/data:Z \
--security-opt=label=disable \
--hooks-dir=/usr/share/containers/oci/hooks.d/ \
signalp_gpu \
signalp6 --fastafile /data/test/predicted.faa --organism other --output_dir /data/test/out_gpu --format txt --mode fast
You can download the output here: signalp-output.tar.xz.
GPU mode gives many warnings like this:
Unknown behaviour encountered for sequence no. 11. Please check outputs.
Also, many proteins are classified differently, e.g.:
[CPU] 1_12 # 13606 # 13803 # -1 # ID=1_12;partial=00;start_type=ATG;rbs_motif=GGAG/GAGG;rbs_spacer=5-10bp;gc_cont=0.424 PILIN 0.004577 0.000000 0.000000 0.000000 0.000000 60031319932928.000000 CS pos: 10-11. Pr: 0.0000
[GPU] 1_12 # 13606 # 13803 # -1 # ID=1_12;partial=00;start_type=ATG;rbs_motif=GGAG/GAGG;rbs_spacer=5-10bp;gc_cont=0.424 OTHER 0.999923 0.000075 0.000001 0.000000 0.000000 0.000002
I suppose something is wrong with the GPU predictions. However, I want to annotate thousands of genomes and need the algorithm to be faster. Do you have an idea of what went wrong? Can you help me?
$ cat /etc/os-release
NAME="Fedora Linux"
VERSION="38 (Workstation Edition)"
...
$ nvidia-smi
Wed May 17 20:47:27 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.41.03 Driver Version: 530.41.03 CUDA Version: 12.1 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce MX150 Off| 00000000:3B:00.0 Off | N/A |
| N/A 93C P0 N/A / N/A| 1659MiB / 2048MiB | 95% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 2530 G /usr/bin/gnome-shell 0MiB |
| 0 N/A N/A 102059 C /usr/bin/python3.9 1656MiB |
+---------------------------------------------------------------------------------------+
Hello,
I am trying to run SignalP on a FASTA file with 258,514 sequences, but I hit the sequence-number limit.
How can I increase the sequence limit in SignalP?
Thanks.
Hi there!
Sorry to bother you again.
I'm still running into issues with the decoding step.
Running this sequence with SignalP 6.0e raises an error:
>P000004B9
MAFRLFAGITGRQLLAGGAALGGTGLAGSLIQTESERLQATEAQVQFHTSSIHPTPVGFS
PWQIRNDYPTSDILKARLKAQKDDSLPNAPSPLIPAPGLPGDFEGENAPWFKYDYEKEPE
KFAEAIREYCFDGNVDKGFRLNENKIRDWYHAPWMHYRDPNSMCTEREPINGFTFERATP
AGEFAKTQNVTLQNWAIGFYNATGATVFGDMWKDPDNPDFSQNKEFPVGTCVFKILLNNS
TPEQMPIQDGAPTMHAVISKSTSNGKERNDFASPLRLIQVDFAVVDKRSPIGWVFGTFMY
NKDQPGKGPWDRLTLVGLQWGNDHWLTNQVYDETKAEGRVAKPRECYIHKKAEDIRKREG
GTRPSWGWNGRMNGPADNFISACASCHSTSTSHPMYNGKVKDGVKQTYGMVPPLNMKPLP
PQPKEGNTFSDVMIYFRNVMGGVPFDEGVNPNNPDEYDPTYKSKVKSADYSLQLQVGWAN
YKKWKEDHETVLQSIFRKTRYVIGSELAGASDLSQRDQGRQEPTDDGPVE
$ signalp6 --fastafile "${1}" --output_dir "${TMPDIR}" --format none --organism eukarya --mode fast --bsize "32" --write_procs 1
Predicting: 100%|██████████| 1/1 [00:04<00:00, 4.87s/sequences]
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/predector/bin/signalp6", line 8, in <module>
sys.exit(predict())
File "/home/ubuntu/miniconda3/envs/predector/lib/python3.6/site-packages/signalp/__init__.py", line 6, in predict
main()
File "/home/ubuntu/miniconda3/envs/predector/lib/python3.6/site-packages/signalp/predict.py", line 239, in main
resolve_viterbi_marginal_conflicts(global_probs, marginal_probs, cleavage_sites, viterbi_paths)
File "/home/ubuntu/miniconda3/envs/predector/lib/python3.6/site-packages/signalp/utils.py", line 311, in resolve_viterbi_marginal_conflicts
cleavage_sites[i] = sp_idx.max() +1
File "/home/ubuntu/miniconda3/envs/predector/lib/python3.6/site-packages/numpy/core/_methods.py", line 39, in _amax
return umr_maximum(a, axis, None, out, keepdims, initial, where)
ValueError: zero-size array to reduction operation maximum which has no identity
I haven't had a huge amount of time to debug it (or decipher how it all works), but it seems as though the marginal probabilities in type_marginal_probs are all assigning it to the PAD token, so you end up with a zero-length array at np.where(np.isin(marginal_region_preds, [5, 10, 19, 25, 31]))[0].
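A minimal guard for that empty-array case might look like the sketch below. The label indices are taken from the snippet above; returning None as the "no site found" sentinel is my assumption, not the package's actual convention:

```python
import numpy as np

SP_LABELS = [5, 10, 19, 25, 31]  # region labels treated as signal peptide

def safe_cleavage_site(marginal_region_preds):
    """Return the 1-based cleavage site, or None when no SP label occurs."""
    sp_idx = np.where(np.isin(np.asarray(marginal_region_preds), SP_LABELS))[0]
    if sp_idx.size == 0:
        # Avoids "ValueError: zero-size array to reduction operation maximum"
        return None
    return int(sp_idx.max()) + 1

print(safe_cleavage_site([0, 0, 5, 10, 2]))  # → 4
print(safe_cleavage_site([0, 1, 2]))         # → None
```

The caller would then need to decide what a None cleavage site means for the final prediction, which is exactly the policy question the conflict-resolution step has to answer.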
I wonder if a property-based testing framework (like https://hypothesis.readthedocs.io/en/latest/) would be helpful for finding these edge cases so they can be handled appropriately.
This seems to have become a troublesome issue.
Hello,
I just ran the downloaded version of SignalP 6.0g and noticed that the TATLIPO column in prediction_results.txt (printed by make_output_files.py:21) reads "TATLIPO(Sec/SPII)" where it presumably should read "TATLIPO(Tat/SPII)".
Best regards
I've installed SignalP 6.0, but when running the command on my FASTA file it gives an error: FileNotFoundError: Slow mode requires the full model to be installed at ..... I'm looking for ensemble_model_signalp6.pt to resolve this error.
If you have any opinion or solution regarding this error, please let me know.
Users are running signalp-6.0 on our compute cluster. Each signalp process takes as many CPU cores as the hardware provides. The processes scale poorly when using more than 8 threads, and especially poorly beyond 64 threads. Please provide a command-line parameter that lets the user define the number of CPU threads signalp uses.
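Until such a flag is available everywhere (newer packages expose --torch_num_threads, as reported in an earlier issue, but older ones may not), a common cluster workaround is capping the usual thread-pool environment variables in the job script before launching. A sketch; whether a given build honours each variable is an assumption:

```python
import os

THREAD_ENV_VARS = (
    "OMP_NUM_THREADS",
    "MKL_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
)

def cap_thread_env(n):
    """Set the common thread-pool env vars before launching signalp6.

    These must be set before the worker process imports torch/numpy,
    e.g. in the batch script that wraps the signalp6 call.
    """
    for var in THREAD_ENV_VARS:
        os.environ[var] = str(n)

cap_thread_env(8)
print(os.environ["OMP_NUM_THREADS"])  # → 8
```

The shell equivalent (export OMP_NUM_THREADS=8 etc. before the signalp6 command) achieves the same thing.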