almaaslab / csdr

Efficient CSD implementation in R

Home Page: https://almaaslab.github.io/csdR

License: GNU General Public License v3.0

R 55.12% C++ 26.81% TeX 18.07%

csdr's Issues

Quick question

Hi, this is a super useful package, so thank you very much. Sorry for the back and forth; I answered some of my own questions. I do have one new one: I understand the results with a high C score (and low S/D scores) and those with a high D score (and low C/S scores), but how would you interpret results that have both high C and S scores (but low D scores)?
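
To make the question concrete, here is a minimal sketch of how a gene pair can score on both C and S at once. It assumes the three scores share a common denominator and have the numerators |rho1 + rho2| (C), ||rho1| - |rho2|| (S) and |rho1| + |rho2| - |rho1 + rho2| (D), where rho1 and rho2 are the correlations in the two conditions. These definitions are an assumption not stated in this thread and should be checked against the package vignette; `csd_numerators` is a hypothetical helper, not part of csdR.

```r
# Numerators of the three scores for one gene pair.
# ASSUMPTION: these formulas are not given in this thread; verify them
# against the csdR vignette before relying on them.
csd_numerators <- function(rho1, rho2) {
    c(
        C = abs(rho1 + rho2),                          # conserved
        S = abs(abs(rho1) - abs(rho2)),                # specific
        D = abs(rho1) + abs(rho2) - abs(rho1 + rho2)   # differentiated
    )
}

same_sign_unequal <- csd_numerators(0.9, 0.3)   # same sign, different strength
same_sign_equal   <- csd_numerators(0.8, 0.8)   # same sign, same strength
opposite_sign     <- csd_numerators(0.8, -0.8)  # sign flips between conditions

print(rbind(same_sign_unequal, same_sign_equal, opposite_sign))
```

Under these assumed definitions, a pair whose correlation keeps its sign but is clearly stronger in one condition has non-zero C and S numerators and a zero D numerator, which would be one way to read a "high C and S, low D" result.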

The input data format for csdR

Hi Developers

I wish to use your package. I have raw counts for two conditions (control vs. knockout), each with 2 biological replicates (i.e. 4 feature count files), but I am not sure how to pass these 4 files to the run_csd function. Looking at the data provided with the package, I see that the values are not raw counts but signed, non-integer values.

> sick_expression[1:10,1:10]
                 TMEM187       IKZF1      TRPV1     CYP2A6       RYR2
TCGA.DJ.A13L  0.25886910  1.31385680 -0.1271020 -0.6834881 -0.4114891
TCGA.DE.A4M9  0.59208636 -1.47097100  1.2461969  0.4717787  1.2144106
TCGA.J8.A3O1 -0.19256716 -0.72812927  0.4885138 -0.7809972 -0.3792373
TCGA.ET.A39K  1.31385680  0.05221925 -1.0537053 -0.5627220 -0.4006925
TCGA.4C.A93U -3.02590166  0.74770917  0.9013970  0.8291135 -0.6524206
TCGA.ET.A25I  0.59802017  0.61594986 -0.5053870  0.4114891  0.2588691
TCGA.ET.A2MZ  0.08710257 -0.21795197 -0.1925672 -0.4885138 -0.4829205
TCGA.EM.A3AK -0.30022446  0.20778210 -0.7742712 -0.8082633  0.3158622
TCGA.FK.A4UB -1.40127989 -1.22486871 -0.5395762 -1.7073658 -0.1522012
TCGA.BJ.A28X -0.83614405  2.44768351 -0.6340798 -0.2434780  1.7181407
                   CDK12     OGFRL1       ATAD1      PPDPFL        LATS1
TCGA.DJ.A13L -0.32633023  0.7477092  0.06217758 -0.09958441  0.107080851
TCGA.DE.A4M9  0.78775867 -1.9593282 -0.12710197 -0.09958441  0.057197707
TCGA.J8.A3O1  1.78740479  0.6340798  1.86648724 -0.09958441  3.025901663
TCGA.ET.A39K -1.65625689  0.5224053 -1.99440254 -0.09958441 -1.763368412
TCGA.4C.A93U  0.51671591  1.5323592  0.93936016 -0.09958441  0.947116194
TCGA.ET.A25I -0.61594986 -0.7675800  0.05719771 -0.09958441 -0.460695126
TCGA.ET.A2MZ  0.02237138  0.5338351  0.33157761 -0.09958441  0.499746840
TCGA.EM.A3AK -0.13713116  1.7181407  0.87923079 -0.09958441  0.002485504
TCGA.FK.A4UB -1.74030965 -0.6647705 -0.71522984 -0.09958441 -0.347375322
TCGA.BJ.A28X  1.36253454  1.4283268 -0.30022446 -0.09958441  0.556907718

In the vignette, you say: "The expression values are coded as continuous numerical values which are comparable between samples." So how do I convert my raw counts to continuous values (do you suggest using normalised reads from DESeq2)? Also, how do I handle biological replicates?
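
As an illustration of the data layout only, the sketch below turns a raw count matrix into continuous, library-size-normalised values (log2 counts per million) and arranges it like `sick_expression` above, with samples in rows and genes in columns. The count values, gene names and sample names are hypothetical, and log-CPM is a swapped-in generic transform, not a method recommended by the package authors; a variance-stabilising transform from a dedicated package would be a reasonable alternative.

```r
# Hypothetical raw counts: genes in rows, samples in columns, as obtained
# after merging one feature count file per sample.
counts <- matrix(
    c(10, 0, 250, 31,    # ctrl_1
      12, 1, 300, 28,    # ctrl_2
      40, 0, 120, 35,    # ko_1
      38, 2, 100, 30),   # ko_2
    nrow = 4,
    dimnames = list(
        c("geneA", "geneB", "geneC", "geneD"),
        c("ctrl_1", "ctrl_2", "ko_1", "ko_2")
    )
)

# Counts per million: divide each sample by its library size, then take
# log2 with a pseudocount of 1 so that zero counts stay finite.
lib_size <- colSums(counts)
log_cpm <- log2(t(t(counts) / lib_size) * 1e6 + 1)

# Samples in rows, genes in columns, matching the layout of sick_expression.
expr <- t(log_cpm)

# One matrix per condition; each replicate is simply one row.
x_control  <- expr[c("ctrl_1", "ctrl_2"), , drop = FALSE]
x_knockout <- expr[c("ko_1", "ko_2"), , drop = FALSE]

print(round(x_control, 2))
```

The two matrices have the shape that would be passed to run_csd as the two expression data sets. Note that with only two samples per condition, the correlation between any two non-constant genes is exactly +1 or -1, so this example shows the format rather than a meaningful analysis.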

Paired samples analysis

Dear authors,

I am grateful for the package and how easy it is to use. I would like to analyse a dataset with paired samples (repeated measures). Can I use the function as-is, i.e. comparing baseline and post-treatment data?

Thank you for your time.

Regards,
Mikhael

Variance and C-, S-, D- scores not showing

Hello,

I am an undergraduate, so my errors may simply be due to inexperience, but I am trying to analyze RNA-seq data with your package. I work in a lab looking for gravitropic genes in the Arabidopsis thaliana model. We have a dataset of previously examined RNA-seq data that I am trying to run through the code in R, but the results are confusing, to say the least. With 10 bootstrap iterations, the variance is zero and the C- and D-values cap out at "infinity". With 100 bootstrap iterations, we get no C-, S-, or D-scores, and numbers appear only in the Rho2 and var2 columns.

I am running the analysis on my laptop (16 GB), and despite a longer wait it runs just fine. Our data has only 4 samples per treatment and tissue, so the analysis covers more than 27,000 genes but only 4 samples. Could either of these be related to the issues we are seeing, or could you offer any other advice?

Thank you,
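
One possible explanation for these symptoms can be sketched in base R, without using csdR itself. The sketch assumes that the variance of a correlation is estimated by resampling the samples with replacement and recomputing a Spearman correlation each time; whether csdR does exactly this internally, and how it treats undefined values, is an assumption here. The expression values are hypothetical.

```r
# Two hypothetical genes measured in only 4 samples, perfectly co-monotone.
x <- c(1.2, 2.5, 3.1, 4.8)
y <- c(0.3, 0.9, 1.4, 2.2)

# Enumerate all 4^4 = 256 possible bootstrap resamples of the 4 samples.
resamples <- as.matrix(expand.grid(1:4, 1:4, 1:4, 1:4))

# Spearman correlation within each resample. A resample that draws the same
# sample four times is constant, so cor() returns NA (with a warning).
rho <- apply(resamples, 1, function(idx) {
    suppressWarnings(cor(x[idx], y[idx], method = "spearman"))
})

n_undefined <- sum(is.na(rho))       # resamples with an undefined correlation
rho_var <- var(rho, na.rm = TRUE)    # spread of the defined correlations

# Chance that a run of n_it iterations contains at least one such resample.
p_degenerate <- n_undefined / nrow(resamples)
p_hit_10  <- 1 - (1 - p_degenerate)^10
p_hit_100 <- 1 - (1 - p_degenerate)^100

print(c(n_undefined = n_undefined, rho_var = rho_var,
        p_hit_10 = p_hit_10, p_hit_100 = p_hit_100))
```

With so few samples, every defined resample gives the same correlation here, so the estimated variance is zero up to floating-point rounding, and any score that divides by it becomes infinite or astronomically large. A constant resample, which is far more likely to occur at least once in 100 iterations than in 10, yields an undefined correlation that would propagate to every score computed from it if it is not filtered out. Under these assumptions, the small sample size, rather than the laptop or the number of genes, would be the likelier cause.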
