
libpredweb's Introduction

libpredweb

This is the library for the protein prediction servers used by the following repos

How to install

Use the following command to install the package with pip

pip install git+https://github.com/nanjiangshu/libpredweb.git@main

Usage

Add the following lines to your Python script in order to use the library

from libpredweb import myfunc
from libpredweb import webserver_common as webcom
from libpredweb import qd_fe_common as qdcom

libpredweb's People

Contributors

nanjiangshu, inghylt, dependabot[bot]


libpredweb's Issues

Investigate why subcons fails to predict for some queries

The subcons web-server fails to predict some queries, such as

>t2
IEVYHSYFWPLEWTIPSRDNNKCYAKIICNTKDNERVVGFHVLGPNAGEVTQGFAAALKCMAQRLLLRRFLASVISRKPSQGQWPPLTSRALQTPQCSPGGLTVTPNPARTIYTTRISLTTFNIQDGPDFQDRVVNSETPVVVDFHAQWCGPCKILGPRLEKMVAKQHGKVVMAKVDIDDHTDLAIEYEAGSLSPRLECSGTITAH
>lcl|ORF34
MEIEGPLPNHHGNIPKNVPFLIGQKARYLFLSVFSMAVIYFTRSPLMGLTHFSAYILITGCIFSFSTLDDTLSSQPTVQCFYGHAHSSTILVPQYFSLQSQQSIPLLPPGMATPCLATNIAKLDTTSTLLTGNRCCSHLCHCCRSKLSLPTTSMSAFFYHQFSVHCSVYL

However, it works with the default example

>sp|P04201|MAS_HUMAN
MDGSNVTSFVVEEPTNISTGRNASVGNAHRQIPIVHWVIMSISPVGFVENGILLWFLCFRMRRNPFTVYITHLSIADISLLFCIFILSIDYALDYELSSGHYYTIVTLSVTFLFGYNTGLYLLTAISVERCLSVLYPIWYRCHRPKYQSALVCALLWALSCLVTTMEYVMCIDREEESHSRNDCRAVIIFIAILSFLVFTPLMLVSSTILVVKIRKNTWASHSSKLYIVIMVTIIIFLIFAMPMRLLYLLYYEYWSTFGNLHHISLLFSTINSSANPFIYFFVGSSKKKRFKESLKVVLTRAFKDEMQPRRQKDNCNTVTVETVV

Figure out the problem and try to fix it.

duplicated processing for archived results

Results are sometimes retrieved from the archive twice for the same job. As a consequence, duplicated records appear in the file finished_seqs.txt.

DoD: The archived results are processed only once
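As a write-side safeguard (not a fix for the double run itself), a record could be appended only when its key is not already present. The sketch below is an assumption-laden illustration: the helper name append_finished_record is hypothetical, and it assumes each record in finished_seqs.txt is one tab-separated line whose first field identifies the sequence, which the issue does not state.

```python
import os


def append_finished_record(path, record):
    """Append `record` to `path` unless its key is already recorded.

    Assumption: the first tab-separated field of a record is its unique key.
    Returns True if the record was written, False if it was a duplicate.
    """
    key = record.split("\t", 1)[0]
    if os.path.exists(path):
        with open(path) as fh:
            for line in fh:
                if line.rstrip("\n").split("\t", 1)[0] == key:
                    return False  # already recorded; skip the duplicate
    with open(path, "a") as fh:
        fh.write(record.rstrip("\n") + "\n")
    return True
```

This rescans the file on every call, which is fine for small files; a long-running process would more likely keep the seen keys in a set.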

use new pdb2seq function

The old pdb2seq() function, based on the Biopython PDBParser, has a bug when the PDB file contains an atom record such as HD11, that is, when the 13th column is not empty and holds the letter 'H'.

We need to write a new function that solves this problem.

DoD: the new pdb2seq() function parses PDB files correctly, including the aforementioned cases.
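One possible approach is to skip atom-name parsing altogether and read only the fixed-width residue columns of each ATOM record, so a four-character atom name such as HD11 in columns 13-16 cannot affect the result. The following is a minimal sketch under that assumption, not the library's implementation; the name pdb2seq_sketch and its signature are hypothetical, and it handles only the 20 standard amino acids of the first model.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def pdb2seq_sketch(pdb_lines, chain_id=None):
    """Return the one-letter sequence read from ATOM records.

    Only fixed columns are used: residue name (18-20), chain ID (22),
    residue number plus insertion code (23-27). Atom names are ignored.
    """
    seq = []
    last_residue = None
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break  # keep only the first model of multi-model files
        if not line.startswith("ATOM") or len(line) < 27:
            continue  # skip HETATM, TER, headers and truncated lines
        chain = line[21]
        if chain_id is not None and chain != chain_id:
            continue
        residue_key = (chain, line[22:27])
        if residue_key == last_residue:
            continue  # another atom of the residue already counted
        last_residue = residue_key
        # Unknown or modified residues are reported as 'X'.
        seq.append(THREE_TO_ONE.get(line[17:20].strip(), "X"))
    return "".join(seq)
```

The function accepts any iterable of lines, so an open file handle can be passed directly, for example pdb2seq_sketch(open(path), chain_id="A").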

Lock file not always cleaned for python pipelines

Description
Some pipelines may be blocked from running because their lock files are not removed after an unexpected exit. This should be fixed so that the lock files are always removed at exit, even when the exit is abnormal.

Applied Python scripts

  • clean_cached_result.py
  • job_final_process.py
  • run_server_statistics.py

Re-build VM for the computational node on RECAS-BARI

As a developer, I want the computational node on RECAS-BARI to work as expected. The previous setup with shared NFS storage is good for scalability, but the VM crashes frequently because of the intensive I/O on the NFS storage. Therefore, the VMs on those nodes need to be rebuilt without shared storage.

Fix vulnerabilities with Django

As a developer, I want Django to be updated so that the vulnerabilities are resolved.

Related links:

https://github.com/NBISweden/predictprotein-webserver-boctopus2/security/dependabot/requirements.txt/django/open
https://github.com/NBISweden/predictprotein-webserver-scampi2/security/dependabot/requirements.txt/django/open

It is probably valid for the other web-servers as well.

For SCAMPI and BOCTOPUS2, please note that the current code base is not compatible with the latest Django, i.e. 2.2.24.

  • pconsc3
  • scampi
  • topcons
  • boctopus
  • proq3
  • subcons
  • prodres
  • common_backend

add function urlretrieve with timeout

The urllib version of urlretrieve hangs when the network is bad. We need to add a function for downloading a file from a URL with a timeout.

DoD: add a function urlretrieve with a timeout option to the module myfunc.
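A minimal sketch of such a function, using only the standard library, is shown below. The signature (url, outfile, timeout) is an assumption; the issue only asks for a timeout option. Note that the timeout passed to urllib.request.urlopen applies to each blocking socket operation (connecting, each read), not to the total download time.

```python
import shutil
import urllib.request


def urlretrieve(url, outfile, timeout=30):
    """Download `url` to `outfile`, failing instead of hanging on a stalled socket.

    `timeout` is in seconds and is passed to urllib.request.urlopen.
    Returns the output path.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response, \
            open(outfile, "wb") as fh:
        # Stream in chunks so large files are not held in memory.
        shutil.copyfileobj(response, fh)
    return outfile
```

Callers would typically catch urllib.error.URLError and socket.timeout (an alias of TimeoutError since Python 3.10) and retry or report the failure.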

port security for the backend server

Use a port other than 80 for the backend servers.

original request from the EGI provider

the following alert has been received from GARR about some issues on your instance on ReCaS-Bari Cloud with IP 212.189.205.55 (TENANT: EGI_bils). We kindly ask you to solve urgently since the deadline is on November 18th 2022. The issues are reported below:
-----------

User: 4eaa8b41022d8c62fa5762151bad366ca87efba29e4f702a52ff2fae5a2e46fd@egi.eu_egiID
Tenant: EGI_bils

--- Start of REPORT ---

Site: ba
Ip: 212.189.205.55
SSLv2: 
SSLv3: 
TLS1: 
TLS1.1: 
TLS1.2: 
TLS1.3: 

--- End of REPORT -----

TO KEEP YOUR CONFIGURATION SAFE: 
   
 1) if a service is ACTIVE on port 80 THERE MUST BE a redirection (automatically or not)
to port 443 
   
 2) if a service is ACTIVE on port 443 MUST
    * ACCEPT only the following protocols:
      - TLS 1.2 
      - TLS 1.3 
    * REJECT the following protocols
      - SSLv2  
      - SSLv3 
      - TLS1 
      - TLS1.1 
 
The following tools are available:  
  * to safely configure the server : 
    https://ssl-config.mozilla.org/ 
  * to check the solution adopted: 
    https://github.com/drwetter/testssl.sh/tree/3.0 

We kindly ask you to give us feedback about the actions undertaken as soon as possible.
Regards

ReCaS-Bari Support Team

Run job completion procedure in a separate process for TOPCONS

When a job contains many (e.g. >1000) sequences, the procedure that creates the dumped result and the zip file may take quite some time and thus block other jobs. It would be better to run the procedure in a separate process, in a queuing system.

Related code:
https://github.com/NBISweden/libpredweb/blob/master/libpredweb/qd_fe_common.py#L1376-L1437
https://github.com/NBISweden/predictprotein-webserver-topcons2/blob/master/proj/pred/app/qd_fe.py#L1164-L1234
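One simple way to keep the daemon loop from blocking is to launch the completion step as a child process and only poll it on later iterations. The sketch below assumes the completion logic can be wrapped in a standalone script that takes the job directory as its argument; that script and the helper names start_finalizer and reap_finished are hypothetical. It is not a full queuing system: it has no concurrency limit or retry handling.

```python
import subprocess
import sys


def start_finalizer(script, jobdir):
    """Launch the job-completion script for `jobdir` and return immediately."""
    return subprocess.Popen([sys.executable, script, jobdir])


def reap_finished(children):
    """Return the child processes that are still running.

    poll() also collects the exit status of finished children,
    so they do not linger as zombie processes.
    """
    return [proc for proc in children if proc.poll() is None]
```

In the main loop, the daemon would append the result of start_finalizer() to a list and call reap_finished() on that list once per iteration, continuing with other jobs in between.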

Backup web-server usage data files

As a developer, I want a proper way to back up the usage data files for the prediction web-servers.
These files were previously backed up on the computer shu.scilifelab.se, but it has now been shut down.
