vivekbhr / subread_to_dexseq

Scripts to import your FeatureCounts output into DEXSeq

License: GNU General Public License v3.0

R 37.39% Python 62.61%

subread_to_dexseq's Issues

geneID/featureID truncated

I saw a warning that one geneID was truncated, and a comment in the R code says this is needed to match featureCounts. Could this truncation step be skipped now?

featureCounts parameters

Hi Vivek.

Thanks for providing the Subread_to_DEXSeq scripts.

I believe there are two typos in the featureCounts example you give:

/path/to/subread-1.4.6-p2/bin/featureCounts -f -O -s 2 -p -T 40 -F GTF -a dm6_ens76_flat.gtf -o dm6_fCount.out Cont_1.bam Cont_2.bam Test_1.bam Test_2.bam

You use -s 2, but this counts reads that are located on the strand opposite the annotated feature. Are you sure this is correct?
Additionally, if I use your dexseq_prepare_annotation2.py to prepare the flattened GTF file, I need to add -t exonic_part to the featureCounts arguments to perform counting at the exon level.

Cheers,
Maurits

FeatureCount parameters

Why is the "Successfully assigned alignments" rate so low (37.3%) when I run the following command?

featureCounts -f -O -p -s 2 -T 16 -F GTF -a ../vm25self.DEXSeq.gtf -o ../vm25selfallDEXSeqcount.txt test1.sort.bam test2.sort.bam

What is more, my data is paired-end, but the command-line output says "The reads are assigned on the single-end mode":

Process BAM file test1.sort.bam... ||
|| Strand specific : reversely stranded ||
|| Paired-end reads are included. ||
|| The reads are assigned on the single-end mode. ||
|| Total alignments : 106372557 ||
|| Successfully assigned alignments : 39673738 (37.3%) ||
|| Running time : 0.18 minutes

Thanks a lot
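
The 37.3% in the log above is simply assigned alignments divided by total alignments. As a minimal sketch, the same rate can be recomputed per sample from summary text laid out the way featureCounts writes its .summary file (a "Status" header row followed by one row per category). The layout is assumed here, and the example lumps all unassigned alignments into a single placeholder row because the post does not show the real breakdown.

```python
def assignment_rates(summary_text):
    """Return {sample: assigned / total} from tab-separated summary text."""
    lines = [ln for ln in summary_text.splitlines() if ln.strip()]
    samples = lines[0].split("\t")[1:]          # header: Status, then one column per BAM
    totals = [0] * len(samples)
    assigned = [0] * len(samples)
    for ln in lines[1:]:
        fields = ln.split("\t")
        status = fields[0]
        counts = [int(x) for x in fields[1:]]
        for i, c in enumerate(counts):
            totals[i] += c
            if status == "Assigned":
                assigned[i] = c
    return {s: (a / t if t else 0.0) for s, a, t in zip(samples, assigned, totals)}


# Totals taken from the log above; the single unassigned row is a placeholder.
example = (
    "Status\ttest1.sort.bam\n"
    "Assigned\t39673738\n"
    "Unassigned_NoFeatures\t66698819\n"
)
rates = assignment_rates(example)
print({name: round(rate, 3) for name, rate in rates.items()})
```

Looking at which unassigned category dominates in the real summary file is usually more informative than the overall rate alone.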

Error in DEXSeqDataSetFromFeatureCounts

Hi,

I'm trying to use your code to run DEXSeq using featureCounts output, but when I run

dxd.fc <- DEXSeqDataSetFromFeatureCounts("counts_fCount.out", flattenedfile = "featurecountsgtf",sampleData = samp)

I get the following error message:

Error in DESeqDataSet(rse, design, ignoreRank = TRUE) :
some values in assay are not integers
In addition: Warning message:
Error in DESeqDataSet(rse, design, ignoreRank = TRUE) :
some values in assay are not integers

Could you please help me out with this as I'm not sure what the problem is.

Thanks,
Sneha
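
The error says that some cells of the count matrix are not whole numbers. One possible cause, which is an assumption because the post does not show the featureCounts command, is fractional counting (for example with the --fraction option). A minimal sketch for locating such cells is given below. It assumes the usual featureCounts table layout: a leading comment line starting with "#", one header row, six annotation columns, then one count column per sample. The feature names and values in the example are made up.

```python
def find_non_integer_counts(lines, first_count_col=6):
    """Yield (feature_id, column_index, raw_value) for cells that are not integers."""
    header_seen = False
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue                      # skip the program/command comment line
        fields = line.rstrip("\n").split("\t")
        if not header_seen:
            header_seen = True            # skip the column header row
            continue
        for idx, raw in enumerate(fields[first_count_col:], start=first_count_col):
            try:
                value = float(raw)
            except ValueError:
                yield fields[0], idx, raw  # not numeric at all
                continue
            if not value.is_integer():
                yield fields[0], idx, raw  # numeric but fractional


# Made-up table for illustration only.
sample = [
    "# Program:featureCounts; Command:...\n",
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\ta.bam\tb.bam\n",
    "geneA:001\tchr1\t1\t100\t+\t100\t12\t7\n",
    "geneA:002\tchr1\t200\t300\t+\t101\t4\t3.5\n",
]
print(list(find_non_integer_counts(sample)))
```

With a real file, pass the open file handle instead of the list, for example find_non_integer_counts(open("counts_fCount.out")).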

GTF to GFF transformation

I downloaded the Homo_sapiens.GRCh38.102.gtf from Ensembl, but I am getting this error for the GTF to GFF transformation.

python /opt/anaconda3/lib/R/library/DEXSeq/python_scripts/dexseq_prepare_annotation.py Homo_sapiens.GRCh38.102.gtf dexseq.gff

File "/opt/anaconda3/lib/R/library/DEXSeq/python_scripts/dexseq_prepare_annotation.py", line 129
raise ValueError, "Same name found on two chromosomes: %s, %s" % ( str(l[i]), str(l[i+1]) )

^
SyntaxError: invalid syntax

How can I fix it?
Thank you!
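
The line flagged in the traceback uses the Python 2 form of the raise statement (raise ValueError, "..."), which a Python 3 interpreter rejects as a syntax error. So the script is Python 2 code being run with Python 3. The sketch below shows the Python 3 spelling of that one statement; the function name and arguments are hypothetical, and other Python 2 constructs elsewhere in the script may need similar changes.

```python
def fail_same_name(first, second):
    """Raise the same message as the failing line, using Python 3 syntax."""
    # Python 2 form (invalid in Python 3):
    #   raise ValueError, "Same name found on two chromosomes: %s, %s" % (...)
    # Python 3 form: the message is passed as an argument to the exception.
    raise ValueError(
        "Same name found on two chromosomes: %s, %s" % (str(first), str(second))
    )


try:
    fail_same_name("chr2L", "chr2R")      # placeholder values
except ValueError as err:
    print(err)
```

Running the original script with a Python 2 interpreter, or using a DEXSeq release whose scripts target Python 3, avoids editing the file by hand.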

long names of gene sets in the flattened gtf stop the R script

Hey Vivek,
Thanks for publishing this nice hack for combining the two useful genomic tools!

I have tried it with hg38 from Ensembl v86. The flattened annotation file happens to contain one gene set whose complete name is 351 characters long: the PCDHG "cascade of genes". featureCounts truncates this name to 256 characters, which causes your script to stop because the names no longer match.

I deleted this set from the GTF together with its 99 exons (it is likely misleading for splicing analysis anyway) and did the counting with the edited GTF; then everything worked OK. But perhaps there is a more elegant solution on your side, such as truncating all names (exoninfo) to 256 characters.

Cheers,
Michal
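
Before counting, the flattened annotation can be scanned for aggregate gene IDs that exceed the length at which truncation was observed. The sketch below is one way to do that. It assumes the ID is stored as a gene_id "..." attribute and uses the 256-character limit reported in the post above as a default; the exact limit may differ between featureCounts versions.

```python
import re

GENE_ID = re.compile(r'gene_id "([^"]+)"')


def long_gene_ids(gtf_lines, limit=256):
    """Return {gene_id: length} for IDs longer than `limit` characters."""
    found = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        match = GENE_ID.search(line)
        if match and len(match.group(1)) > limit:
            found[match.group(1)] = len(match.group(1))
    return found


# Made-up records: one short ID and one long aggregate ID joined with "+".
long_id = "+".join("ENSG%011d" % i for i in range(30))
records = [
    'chr1\tsrc\texonic_part\t1\t100\t.\t+\t.\tgene_id "ENSG00000000001"; exonic_part_number "001"\n',
    'chr5\tsrc\texonic_part\t1\t100\t.\t+\t.\tgene_id "%s"; exonic_part_number "001"\n' % long_id,
]
for gene, length in long_gene_ids(records).items():
    print(length, gene[:40] + "...")
```

Any ID reported here is a candidate for the mismatch between the count table and the flattened annotation.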

"-t" argument, exonic_part

Although a closed issue says that exonic_part should be added, I found that dexseq_prepare_annotation2.py does NOT work with that argument. There is no exonic_part in dm6_ens76_flat.gtf at all; it only appears in dm6_ens76_flat.gff.

Am I right that the "-t" parameter should be deleted?
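
featureCounts selects annotation rows by the feature type in the third column, and its -t option defaults to exon. Whether -t exonic_part is needed therefore depends on what the file passed to -a actually contains. A quick tally of the third column settles it; the sketch below assumes a standard tab-separated GTF/GFF layout and uses made-up records.

```python
from collections import Counter


def feature_type_counts(gtf_lines):
    """Count the values found in the feature-type (third) column."""
    counts = Counter()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.split("\t")
        if len(fields) >= 3:
            counts[fields[2]] += 1
    return counts


# Made-up records for illustration only.
records = [
    "# comment line\n",
    'chr2L\tsrc\texon\t1\t100\t.\t+\t.\tgene_id "g1"\n',
    'chr2L\tsrc\texon\t150\t300\t.\t+\t.\tgene_id "g1"\n',
    'chr2L\tsrc\taggregate_gene\t1\t300\t.\t+\t.\tgene_id "g1"\n',
]
print(feature_type_counts(records).most_common())
```

If the tally shows exon rows and no exonic_part rows, the default -t value already matches the file.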

Something wrong with featureCounts

Hi,
First of all, thanks for your wonderful scripts.

I'm trying to use dexseq_prepare_annotation2.py to prepare the GTF needed for featureCounts and DEXSeq. My source GTF file is Homo_sapiens.GRCh37.75.gtf, downloaded from Ensembl. This step ran successfully without any error.

However, when I ran featureCounts, something went wrong and the following WARNING was reported:

I wonder whether this warning is a problem. Will it affect the subsequent analysis? My subsequent analysis also reported errors, and I don't know what went wrong.

dxd.fc <- DEXSeqDataSetFromFeatureCounts("~/dexseq/test_out3.txt",flattenedfile = "~/annotation/GRCh37_feature.gtf",sampleData = samp)

Reading and adding Exon IDs for DEXSeq
Error in DEXSeqDataSetFromFeatureCounts("~/dexseq/test_out3.txt", flattenedfile = "~/annotation/GRCh37_feature.gtf", :
Count files do not correspond to the flattened annotation file
In addition: Warning message:
1 aggregate geneIDs were found truncated in featureCounts outpu

Thanks
Jessian

Error in source("load_SubreadOutput.R")

Hi, I'm new to this and I got this error. Please help me.

Error in source("load_SubreadOutput.R") :
load_SubreadOutput.R:1:40: unexpected ','
1: {"payload":{"allShortcutsEnabled":false,
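
The first line that R failed to parse is JSON, not R code, which suggests that the saved load_SubreadOutput.R holds the GitHub web page data rather than the raw script. If so, downloading the raw file or cloning the repository should fix it. The heuristic below is only a sketch for spotting such a download: it flags content that begins like JSON or HTML, and the function name is hypothetical.

```python
def looks_like_web_payload(first_bytes):
    """Heuristically detect JSON or HTML content saved in place of a script."""
    head = first_bytes.lstrip()[:200].lower()
    return head.startswith((b"{", b"<!doctype", b"<html"))


# The first line quoted in the error message above.
print(looks_like_web_payload(b'{"payload":{"allShortcutsEnabled":false,'))
# A made-up first line of an ordinary R script, for comparison.
print(looks_like_web_payload(b"# load featureCounts output\n"))
```

To check a file on disk, read its first bytes with open(path, "rb").read(200) and pass them to the function.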

problem for DEXseq

Subread_to_DEXseq problem

Hi,

I'm trying to use your program to feed featureCounts output into DEXSeq, but I'm having some trouble with it.

I'm working with the Homo_sapiens.GRCh38.91.chr.gtf release from Ensembl.

I followed the tutorial. As I am on a Mac, I modified dexseq_prepare_annotation2.py by adding .bak after the 'sed -i' at the end of the script ('sed -i.....' -> 'sed -i.bak....') to avoid an error during the GTF preparation.
So I used the following command:

After that I ran featureCounts-1.6:

As you can see, featureCounts doesn't support the GTF file that the script generates. Do you have any idea what's going wrong with this procedure?

Thanks for the help, and the tutorial
Nicolas
