Giter Site home page Giter Site logo

ppgbiotec_bioinformatica_i's Introduction

Bioinformatics 101 -- PPGBiotec

Summary:

This repository contains the scripts for the discipline of Bioinformatics applied in the postgrad course in Biotechnology.


Author: Célio Dias Santos Jr.

Email: [email protected]

Date: 7th September 2023

Location: São Carlos, SP/Brazil


Install to run on Windows

To run this script, you can follow the instructions for the VSCode and Python installation available in the youtube:

Instructions for installation

Script: 2nd_class_script.py

To run this script you will only need the numpy package. Alternatively you may want to plot the profiles generated by using either Excel or dedicated packages in python, such as Seaborn and Matplotlib.

To install numpy, you only need to:

$ pip install numpy seaborn matplotlib pandas

To run it locally, you will only need:

$ python3 2nd_class_script.py

And it will generate the results in the screen along with several profiles as txt files.

You alternatively can use the functions to calculate your own features of interest separately for sequences you judge important, like this:

from 2nd_class_script import solubility

sequence = 'AKAKAKAKAKAAK'

solubility(sequence, verbose=True)

I recommend you explore the functions available, defined in the script with the term def in the beginning of the line.

Summarized here, follow the functions that you can use as in the above mentioned example:

Function name Description Inputs
calc_aafreq Calculates the frequencies of amino acids and test their fold-change against the average composition of water soluble proteins calc_aafreq(seq)
mw Calculates the molecular weight of proteins in kDa mw(seq, verbose=True)
prot_charge Calculates the protein charge in a given pH prot_charge(seq, pH=7.0, verbose=True)
pI Calculates the isoelectric point of a protein pI(seq, verbose=True)
mol_extinc Calculates the molecular extinction coefficient of a protein in the two states redox mol_extinc(seq, verbose=True)
profile Generates a profile of a protein according to the values of a selected scale [See note] profile(seq, scale, output=f'profile_{scale_name}.tsv', window=5)
antigenicity Calculates the segments potentially antigenic from a sequence with the Kolaskar method antigenicity(seq, verbose=True)
hydrophobicity Calculates the average hydrophobicity of a protein using a given scale hydrophobicity(seq, scale='KD', verbose=True)
solubility Estimates the solubility of a given protein by using hydrophobicity inferences solubility(seq, verbose=True)
radius Calculates the radius for globular and linear (gyration) proteins radius(seq, verbose=True)
sp_vol Calculates the specific volume of a protein in mL/g sp_vol(seq, verbose=True)
pckg_vol Calculates the packed volume of a protein in A^3 pckg_vol(seq, verbose=True)
instability_index Calculates the estability of a protein using instability index (stable if < 40) instability_index(seq, verbose=True)
aliphatic_index Calculates the aliphatic index, which gives insights about thermostability aliphatic_index(seq, verbose=True)
boman_index Calculates Boman index, which gives insights about the protein-protein or membrane-protein interaction boman_index(seq, verbose=True)
hmoment Calculates hydrophobic moment, which gives insights about structural organization hmoment(seq, angle=100, window=11, verbose=True)
motif_finder Searches for a motif sequence inside a protein, returning their start/stop and match motif_finder(motif, seq, output=False, verbose=True)

Note

You can adopt several scales for the calculations, previously available in the system, such as:

Measurement Scale name
flexibility bfactors
antigenicity parker
antigenicity kolaskar
hydrophobicity kyte_doolitle
hydrophobicity hopp_woods
hydrophobicity cornette
hydrophobicity eisenberg
hydrophobicity rose
hydrophobicity janin
hydrophobicity engelman

When calculating profiles, you can modify the three variables in the lines 485-487 in the main function:

profile_graphs = True
profile_numbers = False
profile_tables = False

These lines will if the first is set to True generate graphs using seaborn in the local it is executed, if the second is set to True, it will generate tables printed on the screen, and if the 3rd is set to True you can save the results as a .tsv file.

In all cases you can get the same results for a profile using a given scale.

ppgbiotec_bioinformatica_i's People

Contributors

celiosantosjr avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.