
*_genes.out duplicate genes (tpmcalculator issue, OPEN)

ncbi commented on May 26, 2024
*_genes.out duplicate genes

from tpmcalculator.

Comments (3)

dodoflyy commented on May 26, 2024

Then how does TPMCalculator identify gene copies? In my GTF, only the first record has the feature "gene"; the others are "transcript".
And I can't see anything special about the last 2 transcripts in their attributes.

$ grep "ENSG00000235538" gencode.v35.annotation.gtf | awk -v FS="\t" '$3!="exon" {print $9}'
gene_id "ENSG00000235538.3"; gene_type "lncRNA"; gene_name "AL078602.1"; level 2; tag "ncRNA_host"; havana_gene "OTTHUMG00000195978.1";
gene_id "ENSG00000235538.3"; transcript_id "ENST00000665613.1"; gene_type "lncRNA"; gene_name "AL078602.1"; transcript_type "lncRNA"; transcript_name "AL078602.1-201"; level 2; tag "TAGENE"; havana_gene "OTTHUMG00000195978.1"; havana_transcript "OTTHUMT00000518669.1";
gene_id "ENSG00000235538.3"; transcript_id "ENST00000671100.1"; gene_type "lncRNA"; gene_name "AL078602.1"; transcript_type "lncRNA"; transcript_name "AL078602.1-202"; level 2; tag "basic"; tag "TAGENE"; havana_gene "OTTHUMG00000195978.1"; havana_transcript "OTTHUMT00000522294.1";
gene_id "ENSG00000235538.3"; transcript_id "ENST00000657614.1"; gene_type "lncRNA"; gene_name "AL078602.1"; transcript_type "lncRNA"; transcript_name "AL078602.1-203"; level 2; tag "basic"; tag "TAGENE"; havana_gene "OTTHUMG00000195978.1"; havana_transcript "OTTHUMT00000507375.1";
gene_id "ENSG00000235538.3"; transcript_id "ENST00000665405.1"; gene_type "lncRNA"; gene_name "AL078602.1"; transcript_type "lncRNA"; transcript_name "AL078602.1-204"; level 2; tag "TAGENE"; havana_gene "OTTHUMG00000195978.1"; havana_transcript "OTTHUMT00000505809.1";
gene_id "ENSG00000235538.3"; transcript_id "ENST00000669147.1"; gene_type "lncRNA"; gene_name "AL078602.1"; transcript_type "lncRNA"; transcript_name "AL078602.1-205"; level 2; tag "TAGENE"; havana_gene "OTTHUMG00000195978.1"; havana_transcript "OTTHUMT00000506093.1";
gene_id "ENSG00000235538.3"; transcript_id "ENST00000657157.1"; gene_type "lncRNA"; gene_name "AL078602.1"; transcript_type "lncRNA"; transcript_name "AL078602.1-206"; level 2; tag "TAGENE"; havana_gene "OTTHUMG00000195978.1"; havana_transcript "OTTHUMT00000512883.1";
gene_id "ENSG00000235538.3"; transcript_id "ENST00000667749.1"; gene_type "lncRNA"; gene_name "AL078602.1"; transcript_type "lncRNA"; transcript_name "AL078602.1-207"; level 2; tag "TAGENE"; havana_gene "OTTHUMG00000195978.1"; havana_transcript "OTTHUMT00000506923.1";
gene_id "ENSG00000235538.3"; transcript_id "ENST00000664207.1"; gene_type "lncRNA"; gene_name "AL078602.1"; transcript_type "lncRNA"; transcript_name "AL078602.1-208"; level 2; tag "basic"; tag "TAGENE"; havana_gene "OTTHUMG00000195978.1"; havana_transcript "OTTHUMT00000509534.1";
gene_id "ENSG00000235538.3"; transcript_id "ENST00000659903.1"; gene_type "lncRNA"; gene_name "AL078602.1"; transcript_type "lncRNA"; transcript_name "AL078602.1-209"; level 2; tag "basic"; tag "TAGENE"; havana_gene "OTTHUMG00000195978.1"; havana_transcript "OTTHUMT00000508554.1";
gene_id "ENSG00000235538.3"; transcript_id "ENST00000666400.1"; gene_type "lncRNA"; gene_name "AL078602.1"; transcript_type "lncRNA"; transcript_name "AL078602.1-210"; level 2; tag "basic"; tag "TAGENE"; havana_gene "OTTHUMG00000195978.1"; havana_transcript "OTTHUMT00000521262.1";
gene_id "ENSG00000235538.3"; transcript_id "ENST00000452944.1"; gene_type "lncRNA"; gene_name "AL078602.1"; transcript_type "lncRNA"; transcript_name "AL078602.1-211"; level 2; transcript_support_level "5"; havana_gene "OTTHUMG00000195978.1"; havana_transcript "OTTHUMT00000043020.1";
gene_id "ENSG00000235538.3"; transcript_id "ENST00000669856.1"; gene_type "lncRNA"; gene_name "AL078602.1"; transcript_type "lncRNA"; transcript_name "AL078602.1-212"; level 2; tag "TAGENE"; havana_gene "OTTHUMG00000195978.1"; havana_transcript "OTTHUMT00000512193.1";
gene_id "ENSG00000235538.3"; transcript_id "ENST00000657138.1"; gene_type "lncRNA"; gene_name "AL078602.1"; transcript_type "lncRNA"; transcript_name "AL078602.1-213"; level 2; tag "TAGENE"; havana_gene "OTTHUMG00000195978.1"; havana_transcript "OTTHUMT00000513259.1";
gene_id "ENSG00000235538.3"; transcript_id "ENST00000659063.1"; gene_type "lncRNA"; gene_name "AL078602.1"; transcript_type "lncRNA"; transcript_name "AL078602.1-214"; level 2; tag "TAGENE"; havana_gene "OTTHUMG00000195978.1"; havana_transcript "OTTHUMT00000527573.1";
gene_id "ENSG00000235538.3"; transcript_id "ENST00000654484.1"; gene_type "lncRNA"; gene_name "AL078602.1"; transcript_type "lncRNA"; transcript_name "AL078602.1-215"; level 2; tag "basic"; tag "TAGENE"; havana_gene "OTTHUMG00000195978.1"; havana_transcript "OTTHUMT00000527575.1";


r78v10a07 commented on May 26, 2024

Hi,
This is the normal way TPMCalculator quantifies RNA-Seq abundance for copies of the same gene. If you look at the output, the third column is the start coordinate of the gene. In your example, each gene copy starts at a different position. TPMCalculator uses #1, #2, #3 ... to identify the copies.


r78v10a07 commented on May 26, 2024

We identify the copies using the genomic coordinates. If the transcripts of the same gene lie in different genomic regions and do not overlap, we mark each region as a copy of the same gene.
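
To make the coordinate-based rule concrete, here is a minimal Python sketch of one way to group the transcripts of a gene into non-overlapping regions and label each region with #1, #2, #3. It is not TPMCalculator's actual code: the function names, the transcript IDs, the coordinates, and the exact label format ("GENE#1") are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# (start, end) in 1-based inclusive coordinates, as used in GTF files
Interval = Tuple[int, int]


def group_into_copies(transcripts: Dict[str, Interval]) -> List[List[str]]:
    """Cluster transcripts whose genomic intervals overlap.

    Transcripts are sorted by start position; a new cluster begins
    whenever a transcript starts after the end of the current cluster.
    """
    ordered = sorted(transcripts.items(), key=lambda item: item[1])
    clusters: List[List[str]] = []
    cluster_end = None
    for transcript_id, (start, end) in ordered:
        if cluster_end is None or start > cluster_end:
            # No overlap with the current cluster: start a new one.
            clusters.append([transcript_id])
            cluster_end = end
        else:
            # Overlap: extend the current cluster.
            clusters[-1].append(transcript_id)
            cluster_end = max(cluster_end, end)
    return clusters


def label_copies(gene_id: str, transcripts: Dict[str, Interval]) -> Dict[str, List[str]]:
    """Label each cluster; the '#n' suffix format here is an assumption."""
    clusters = group_into_copies(transcripts)
    if len(clusters) == 1:
        return {gene_id: clusters[0]}
    return {f"{gene_id}#{i}": cluster for i, cluster in enumerate(clusters, start=1)}


# Hypothetical coordinates, not taken from the GTF records in this thread.
example = {
    "tx_a": (100, 500),
    "tx_b": (400, 900),    # overlaps tx_a, so same region
    "tx_c": (2000, 2500),  # separate region
    "tx_d": (3000, 3200),  # separate region
}
copies = label_copies("GENE", example)
for label, members in copies.items():
    print(label, members)
```

Under this sketch, a gene whose transcripts all overlap keeps a single entry, while transcripts in separate, non-overlapping regions produce one entry per region. This would explain why a gene with a single "gene" record in the GTF can still appear several times in *_genes.out.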
