Giter Site home page Giter Site logo

subread_to_dexseq's Issues

geneID/featureID truncated

I saw a warning that one geneID got truncated, and in the R codes it said this is needed to match
featureCounts. Could this truncate step be skipped now?

featureCounts parameters

Hi Vivek.

Thanks for providing the Subread_to_DEXSeq scripts.

I believe there are two typos in the featureCounts example you give
/path/to/subread-1.4.6-p2/bin/featureCounts -f -O -s 2 -p -T 40 -F GTF -a dm6_ens76_flat.gtf -o dm6_fCount.out Cont_1.bam Cont_2.bam Test_1.bam Test_2.bam

  1. You use -s 2 but this would count reads if they are located on the opposite strand of the annotated feature. Are you sure this is correct?
  2. Additionally, if I use your dexseq_prepare_annotation2.py to prepare the flattened GTF file, I need to add -t exonic_part to the featureCounts arguments to perform counting at the exon level.

Cheers,
Maurits

FeatureCount parameters

Why is the Successfully assigned alignments so low 37.3% when i use featureCounts -f -O -p -s 2 -T 16 -F GTF -a ../vm25self.DEXSeq.gtf -o ../vm25selfallDEXSeqcount.txt test1.sort.bam test2.sort.bam
what is more, my data is paired-end, but commandline display The reads are assigned on the single-end mode

Process BAM file test1.sort.bam... ||
|| Strand specific : reversely stranded ||
|| Paired-end reads are included. ||
|| The reads are assigned on the single-end mode. ||
|| Total alignments : 106372557 ||
|| Successfully assigned alignments : 39673738 (37.3%) ||
|| Running time : 0.18 minutes

Thanks a lot

Error in DEXSeqDataSetFromFeatureCounts

Hi,

I'm trying to use your code to run DEXSeq using featureCounts output, but when I run

dxd.fc <- DEXSeqDataSetFromFeatureCounts("counts_fCount.out", flattenedfile = "featurecountsgtf",sampleData = samp)

I get the following error message:

Error in DESeqDataSet(rse, design, ignoreRank = TRUE) :
some values in assay are not integers
In addition: Warning message:
Error in DESeqDataSet(rse, design, ignoreRank = TRUE) :
some values in assay are not integers

Could you please help me out with this as I'm not sure what the problem is.

Thanks,
Sneha

GTF to GFF transformation

I downloaded the Homo_sapiens.GRCh38.102.gtf from Ensembl, but I am getting this error for the GTF to GFF transformation.

python /opt/anaconda3/lib/R/library/DEXSeq/python_scripts/dexseq_prepare_annotation.py Homo_sapiens.GRCh38.102.gtf dexseq.gff

File "/opt/anaconda3/lib/R/library/DEXSeq/python_scripts/dexseq_prepare_annotation.py", line 129
raise ValueError, "Same name found on two chromosomes: %s, %s" % ( str(l[i]), str(l[i+1]) )

^
SyntaxError: invalid syntax

so how can I to fix it?
thank you!

long names of gene sets in the flattened gtf stop the R script

Hey Vivek,
Thanks for publishing this nice hack for combining the two useful genomic tools!

I have tried it with hg38 from Encembl v86. The flattened annotation file happens to have one set having the complete name 351 chars long - it is PCDHG "cascade of genes". This gets truncated to 256 chars by featureCounts and causes your script to stop due not matching names.

I have deleted this set from the gtf together with its 99 exons (as likely misleading for splicing anyway) and did the counting with such gtf, then everything worked OK. But perhaps there can be more polite solution your own way, such as truncating all names(exoninfo) to 256 characters etc....

Cheers,
Michal

"-t" argument, exonic_part

Though I saw a closed issue said the exonic_part should be added, I found the dexseq_prepare_annotation2.py does NOT work with that argument. There is no exonic_part
in the dm6_ens76_flat.gtf at all, it only appears in the dm6_ens76_flat.gff

Am I right that the "-t" parameter should be deleted?

Something wrong with featureCounts

Hi,
firstly, thanks for your wonderful scripts.

I'm trying to use dexseq_prepare_annotation2.py to prepare the needed GTF for featureCounts and DEXSeq, my primary GTF file is Homo_sapiens.GRCh37.75.gtf which downloaded from ensembl.Then, This step is running successfully whithout any error.

However, when I ran featureCounts, something wrong happened, WARNING was reported as follow:
image

I wonder if this mistake gets in the way? Will it affect the subsequent analysis? Since my subsequent analysis also reported errors, I don't know what went wrong.

dxd.fc <- DEXSeqDataSetFromFeatureCounts("~/dexseq/test_out3.txt",flattenedfile = "~/annotation/GRCh37_feature.gtf",sampleData = samp)

Reading and adding Exon IDs for DEXSeq
Error in DEXSeqDataSetFromFeatureCounts("~/dexseq/test_out3.txt", flattenedfile = "~/annotation/GRCh37_feature.gtf", :
Count files do not correspond to the flattened annotation file
此外: Warning message:
1 aggregate geneIDs were found truncated in featureCounts outpu

Thanks
Jessian

Error in source("load_SubreadOutput.R")

Hi im new in this and i got yhis error, please help me

Error in source("load_SubreadOutput.R") :
load_SubreadOutput.R:1:40: unexpected ','
1: {"payload":{"allShortcutsEnabled":false,

Subread_to_DEXseq problem

Hi,

I'm trying to use your program in order to use featureCount as input for DEXseq, but I'm facing some troubles with it.

I'm working with Homo_sapiens.GRCh38.91.chr.gtf release from ensembl

capture d ecran 2018-05-23 a 15 08 57

I followed the tutorial, as I am on mac, I modify the dexseq_prepare_annotation2.py, by adding a .bak after 'sed -i.....' -> 'sed -i.bak....' at the end of the script to do not have error during the GTF preparation.
So I used the following command :

capture d ecran 2018-05-23 a 15 19 54

And after that start to use featureCounts-1.6

capture d ecran 2018-05-23 a 15 16 40

As you can see, featureCounts doesn't support the GTF file that generate the script. Do you have any idea what's going wrong with this procedure ?

Thanks for the help, and the tutorial
Nicolas

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.