Extract protein seq (gff3toolkit, closed)

nal-i5k commented on June 4, 2024
Extract protein seq

from gff3toolkit.

Comments (4)

mpoelchau commented on June 4, 2024

Hi @bsb2014 - The code that handles IUPAC bases in protein translation is here:

CODONS.extend(['GCN', 'TGY', 'GAY', 'GAR', 'TTY', 'GGN', 'CAY', 'ATH', 'AAR', 'TTR', 'CTN', 'YTR', 'AAY', 'CCN', 'CAR', 'CGN', 'AGR', 'MGR', 'TCN', 'AGY', 'ACN', 'GTN', 'NNN', 'TAY', 'TAR', 'TRA']) # IUB Depiction

In short, line 56 builds an array of every possible three-base codon from A, T, G and C; line 57 then appends all known codons that contain IUPAC bases to that array. Line 58 defines an array of amino acid symbols in the same order as the codon array, and line 59 combines the two arrays into a dictionary.
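The four steps described above can be sketched roughly as follows. This is an illustration under assumptions, not the toolkit's actual source: the names `BASES`, `AMINO_ACIDS`, `CODON_TABLE` and `translate` are hypothetical, the base order `TCAG` and the amino acid string (standard genetic code) are assumptions, and the `X` returned for codons missing from the table is a choice made for this sketch only. The list of IUPAC codons is the one quoted above.

```python
from itertools import product

# Step 1: every three-base codon over the four unambiguous bases (64 total).
# The TCAG order is an assumption; it matches the amino acid string below.
BASES = "TCAG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]

# Step 2: append the codons that contain IUPAC ambiguity symbols.
CODONS.extend(['GCN', 'TGY', 'GAY', 'GAR', 'TTY', 'GGN', 'CAY', 'ATH', 'AAR',
               'TTR', 'CTN', 'YTR', 'AAY', 'CCN', 'CAR', 'CGN', 'AGR', 'MGR',
               'TCN', 'AGY', 'ACN', 'GTN', 'NNN', 'TAY', 'TAR', 'TRA'])

# Step 3: amino acid symbols in the same order as CODONS
# (standard genetic code, then one symbol per IUPAC codon).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
    "ACDEFGHIKLLLNPQRRRSSTVXY**"
)
assert len(CODONS) == len(AMINO_ACIDS)

# Step 4: combine the two arrays into a lookup dictionary.
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(seq):
    """Translate in frame 0; codons missing from the table become 'X'."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


print(translate("ATGGCNTAR"))
```

Because `GCN` and `TAR` are in the table, they translate to a definite symbol (`A` and `*`) even though they contain ambiguous bases.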

Hth, let us know if you have any questions!

bsb2014 commented on June 4, 2024

Is it possible to add a function that translates codons containing IUPAC bases to "-"? Thanks

mpoelchau commented on June 4, 2024

Do you mean IUPAC symbols that code for multiple bases (e.g. R stands for G or A)? Unfortunately, we don't have the capacity to develop new features for this package. If you have a valid reason to do so, you could replace the symbols that code for multiple bases with Ns in your nucleotide sequence. I wouldn't change the nucleotide sequence or the protein translation without a really good reason, though.
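A minimal sketch of that masking step, assuming the sequence is available as a plain string. The name `mask_ambiguous` is hypothetical and not part of gff3toolkit; the function replaces every IUPAC symbol that stands for two or three bases with `N` and leaves A, C, G, T and existing Ns unchanged.

```python
import re

# IUPAC nucleotide symbols that stand for two or three possible bases.
# N (any base) is deliberately left out because it is the replacement.
AMBIGUOUS = re.compile(r"[RYSWKMBDHV]", re.IGNORECASE)


def mask_ambiguous(seq):
    """Return seq with every multi-base IUPAC symbol replaced by 'N'."""
    return AMBIGUOUS.sub("N", seq)


print(mask_ambiguous("ATGRCYTAN"))
```

It is safer to run this on a copy of the FASTA sequence so that the original bases stay available.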

bsb2014 commented on June 4, 2024

In my case, the IUPAC bases (e.g. R) in the FASTA file indicate sequencing errors or heterozygous regions. I would like to label the amino acids that were translated from these IUPAC bases. Thanks
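One way to get this labelling outside the toolkit is a small standalone translator that only knows the 64 unambiguous codons and emits "-" for everything else. This is a sketch under assumptions, not a gff3toolkit feature: `STANDARD_TABLE` and `translate_flagged` are hypothetical names, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code over unambiguous bases only (TCAG order).
BASES = "TCAG"
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_TABLE = dict(
    zip(("".join(c) for c in product(BASES, repeat=3)), STANDARD_AA)
)


def translate_flagged(seq, flag="-"):
    """Translate in frame 0; any codon with a non-ACGT symbol becomes flag."""
    seq = seq.upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        # Codons containing R, Y, N, ... are not in the table, so they
        # fall back to the flag character.
        protein.append(STANDARD_TABLE.get(codon, flag))
    return "".join(protein)


print(translate_flagged("ATGGCRTTTTAA"))
```

Here `GCR` would normally translate to A, but it is reported as "-" so that positions derived from ambiguous bases stay visible in the protein sequence.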
