ezrun's People

Contributors

agi0917, brisk022, codingkaiser, daymegr, ge11232002, giancarlorussofgcz, hrehrauer, kalinnonchev, masaomi, masaomihatakeyama, miqg, opitzl, p-gueguen, pdschmid, peterleary, weihongqi, zajacn

ezrun's Issues

Fisher's exact test for DEG

For the up- and downregulated genes separately, run a Fisher's exact
test to see whether the regulated genes overlap with the high/low GC
and short/long genes, i.e. with the gene classes defined as:

gcTypes = data.frame("GC < 0.4"=as.numeric(rowData(rawData)$gc) < 0.4,
                     "GC > 0.6"=as.numeric(rowData(rawData)$gc) > 0.6,
                     check.names=FALSE)
widthTypes = data.frame("width < 800nt"=as.numeric(rowData(rawData)$width) < 800,
                        "width > 2000nt"=as.numeric(rowData(rawData)$width) > 2000,
                        check.names=FALSE)

That should be part of the report. If there is a significant overlap,
the result is suspicious and potentially an artefact.
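
A minimal sketch of how such a test could look, using only base R. The gene annotation and the set of upregulated genes are simulated here; `isUp` and `testOverlap` are hypothetical names, and in the app the `gc` and `width` vectors would come from `rowData(rawData)` as above. The one-sided alternative asks whether a gene class is over-represented among the regulated genes.

```r
# Toy data: all values below are simulated for illustration only.
set.seed(1)
nGenes <- 2000
gc <- runif(nGenes, min = 0.3, max = 0.7)         # stand-in for rowData(rawData)$gc
width <- round(runif(nGenes, 200, 5000))          # stand-in for rowData(rawData)$width
isUp <- seq_len(nGenes) %in% sample(nGenes, 150)  # hypothetical set of upregulated genes

gcTypes <- data.frame("GC < 0.4" = gc < 0.4,
                      "GC > 0.6" = gc > 0.6,
                      check.names = FALSE)
widthTypes <- data.frame("width < 800nt" = width < 800,
                         "width > 2000nt" = width > 2000,
                         check.names = FALSE)

# One Fisher's exact test per gene class (column of `types`).
# alternative = "greater" tests for enrichment of the class among regulated genes.
testOverlap <- function(isRegulated, types) {
  sapply(types, function(inClass) {
    tab <- table(factor(isRegulated, levels = c(FALSE, TRUE)),
                 factor(inClass, levels = c(FALSE, TRUE)))
    fisher.test(tab, alternative = "greater")$p.value
  })
}

pUp <- testOverlap(isUp, cbind(gcTypes, widthTypes))
print(signif(pUp, 3))
```

The same call with a logical vector of downregulated genes gives the second set of p-values. Since four classes are tested per direction, a multiple-testing correction such as `p.adjust()` may be worth applying before flagging a result.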

ncPRO app requires "Read Count" column

When the "Read Count" column is absent from the dataset definition, the app stops with
Error in `$<-.data.frame`(`*tmp*`, removed, value = integer(0))

I suspect this is due to this part of the app:

ezRun/R/app-ncpro.R

Lines 88 to 100 in 6992d62

if (!is.null(dataset$"Read Count") && is.numeric(dataset$"Read Count") && all(dataset$"Read Count" > 0)){
readCounts$untrimmed = dataset[samples, "Read Count"]
}
readCounts$remaining = countReadsInFastq(trimmedFastqFiles)
ezWrite.table(readCounts, "trimCounts.txt")
readCounts$removed = readCounts$untrimmed - readCounts$remaining
# plotCmd = expression({
# par(mar=c(12, 4.1, 4.1, 2.1))
# barplot(t(as.matrix(readCounts[ , c("remaining", "removed")])), las=2, border=NA,
# main="Read Counts after trimming", legend.text=TRUE, col=c("gray30", "gray"))
# })
# unusedLink = ezImageFileLink(plotCmd, file=param$readCountsBarplot, width=400 + nrow(readCounts) * 10, height=700)

(More specifically line 93)

I haven't tried, but putting it under the line 88 conditional could probably solve the issue; alternatively, making Read Count a necessary column (e.g. in the sushi app) obviously does solve the problem.
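
A minimal sketch of the first option, guarding the subtraction so it only runs when the untrimmed counts exist. `addRemovedCounts` is a hypothetical helper written for illustration, not the actual code in app-ncpro.R, and the data frames are toy data.

```r
# Hypothetical helper; mirrors the logic quoted above but is not the ezRun code.
addRemovedCounts <- function(readCounts) {
  # Only compute "removed" when the untrimmed counts are available,
  # i.e. when the dataset had a usable "Read Count" column.
  if ("untrimmed" %in% colnames(readCounts)) {
    readCounts$removed <- readCounts$untrimmed - readCounts$remaining
  }
  readCounts
}

# Dataset without "Read Count": no untrimmed column, so nothing is added.
withoutCounts <- data.frame(remaining = c(90L, 80L),
                            row.names = c("s1", "s2"))
print(addRemovedCounts(withoutCounts))

# Dataset with "Read Count": removed = untrimmed - remaining.
withCounts <- data.frame(untrimmed = c(100L, 100L),
                         remaining = c(90L, 80L),
                         row.names = c("s1", "s2"))
print(addRemovedCounts(withCounts))
```

Without the guard, `readCounts$untrimmed` is `NULL`, the subtraction yields a zero-length vector, and assigning that to a data frame with rows fails, which matches the error shown above.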

setUnlistDataNames warnings

Why are there warnings from the RnaBamStats app? Possibly from getRangesCoverage.
The results look fine.

40: In setUnlistDataNames(x@unlistData, x@partitioning, use.names,  ... :
  failed to set names on the unlisted CompressedRleList object

2-pass mapping for STAR

Do per-sample 2-pass mapping for now.
Multi-sample 2-pass mapping is not feasible with the current sushi STAR mapping setup.
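
For reference, per-sample 2-pass mapping is switched on in STAR with `--twopassMode Basic`. The sketch below only assembles such a command line in R; `buildStarTwoPassCmd` is a hypothetical helper, the paths and file names are placeholders, and this is not how the sushi app builds its STAR call.

```r
# Hypothetical helper that only assembles the command string; it does not run STAR.
buildStarTwoPassCmd <- function(genomeDir, fastqFiles, outPrefix, threads = 8) {
  paste("STAR",
        "--runThreadN", threads,
        "--genomeDir", shQuote(genomeDir),
        "--readFilesIn", paste(shQuote(fastqFiles), collapse = " "),
        "--twopassMode Basic",  # per-sample 2-pass: pass-1 junctions are reused in pass 2
        "--outSAMtype BAM SortedByCoordinate",
        "--outFileNamePrefix", shQuote(outPrefix))
}

# Placeholder inputs for one paired-end sample.
cmd <- buildStarTwoPassCmd(genomeDir = "/path/to/STARindex",
                           fastqFiles = c("sample1_R1.fastq.gz", "sample1_R2.fastq.gz"),
                           outPrefix = "sample1_")
cat(cmd, "\n")
```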

"Using isoform level annotation and aggregating." is too slow

Need a faster implementation.
Example:

setwd("/scratch/gtan/debug/p2710-CountQC")
library(ezRun)
param = list()
param[['cores']] = '1'
param[['ram']] = '2'
param[['scratch']] = '10'
param[['node']] = ''
param[['process_mode']] = 'DATASET'
param[['name']] = 'Count_QC'
param[['refBuild']] = 'p2710_Perma/FGCZ/Metabat_20180427/Annotation/Release_2018_04_30'
param[['refFeatureFile']] = 'genes.gtf'
param[['featureLevel']] = 'gene'
param[['normMethod']] = 'logMean'
param[['runGO']] = 'true'
param[['backgroundExpression']] = '10'
param[['transcriptTypes']] = ''
param[['specialOptions']] = ''
param[['expressionName']] = ''
param[['mail']] = '[email protected]'
param[['dataRoot']] = '/srv/gstore/projects'
param[['resultDir']] = 'p2710/CountQC_26335_2018-05-05--15-31-20'
output = list()
output[['Name']] = 'Count_QC'
output[['Species']] = 'NA'
output[['refBuild']] = 'p2710_Perma/FGCZ/Metabat_20180427/Annotation/Release_2018_04_30'
output[['Static Report [Link]']] = 'p2710/CountQC_26335_2018-05-05--15-31-20/Count_QC/00index.html'
output[['Live Report [Link]']] = 'http://fgcz-shiny.uzh.ch/fgcz_exploreCountQC_app/?data=p2710/CountQC_26335_2018-05-05--15-31-20/Count_QC/counts-qjliarzqgzfz-EzResult.RData'
output[['Report [File]']] = 'p2710/CountQC_26335_2018-05-05--15-31-20/Count_QC'
input = '/srv/gstore/projects/p2710/CountQC_26335_2018-05-05--15-31-20/input_dataset.tsv'
EzAppCountQC$new()$run(input=input, output=output, param=param)
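
One possible direction, assuming the slow step is the summing of isoform-level counts per gene (this has not been checked against the ezRun implementation): base R's `rowsum()` does the grouping in compiled code and avoids an R-level loop over genes. The counts, transcript IDs and gene IDs below are simulated.

```r
# Toy isoform-level counts; transcript and gene IDs are made up.
set.seed(42)
nIso <- 10000
geneId <- paste0("gene", sample(3000, nIso, replace = TRUE))
isoCounts <- matrix(rpois(nIso * 4, lambda = 20), nrow = nIso,
                    dimnames = list(paste0("tx", seq_len(nIso)),
                                    paste0("sample", 1:4)))

# Fast aggregation: sum the rows of isoCounts within each gene.
geneCounts <- rowsum(isoCounts, group = geneId)

# Reference result from a straightforward per-gene loop, for comparison only.
slow <- t(sapply(split(seq_len(nIso), geneId),
                 function(idx) colSums(isoCounts[idx, , drop = FALSE])))

# Both approaches give the same gene-level counts.
stopifnot(all(geneCounts == slow[rownames(geneCounts), colnames(geneCounts)]))
```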
