paulingliu / rogue
Assessing the purity of single cell population
License: BSD 3-Clause "New" or "Revised" License
Hello,
I have a Smart-seq2 scRNA-seq dataset with multiple samples.
However, I only want to check whether the resolution parameter is suitable for the whole dataset rather than for each sample, so I pass orig.ident (identical for all cells) as the "samples" parameter.
I call rogue as follows:
rogue(seurat@assays$RNA@data, labels = seurat$seurat_clusters, samples = seurat$orig.ident, platform = "full-length", span = 0.5)
Is this correct?
Hi, this is a great tool for scRNA-seq data analysis! I wonder whether we can get ROGUE values for multiple cell clusters from a single sample with a simple function, like the rogue() function for multi-sample data?
Hi there,
I regularly use ROGUE to analyze my data, following the tutorial in the vignettes, but I get this error with some datasets when I run rogue():
Error in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x else if (is.data.frame(newdata)) as.matrix(model.frame(delete.response(terms(object)), : NA/NaN/Inf in foreign function call (arg 5)
I dug into your code to see whether I could debug this because, initially, I thought my data had a problem (such as an NA value). After some time, though, I don't think that's the case. 😔
What I was able to do was isolate the problem: the error happens when I pass the data into SE_fun(), which is called inside rogue(), at line 348 of /R/ROGUE.R.
I'm attaching the 'expr' in the file expr.txt so you can try to reproduce it using the following parameters:
ROGUE:::SE_fun(new_exp, span=0.6, r =1)
ROGUE has been essential to my analysis, so I'm hoping we can figure out what's going on 😅
Thanks in advance!
Hi,
What is the best way to run SE_fun when there are multiple samples/batches?
Should I run it on each sample separately, i.e. select genes with p.adj < 0.05 (or similar) for each sample and then combine them, or should I run it on the combined count matrix of all samples, i.e. assuming SE_fun is not affected by batch?
Thanks
Hello @PaulingLiu !
Thank you very much for your useful tool.
Could this entropy-based purity metric be applied to scATAC-seq data?
I am confused about "cluster" here. I think "cluster" should perhaps be "clusters"?
Hello,
I got the following error when trying to run rogue:
> rogue.res <- rogue(expr, labels = meta$leiden_0.8, samples = meta$Treatment, platform = "UMI", span = 0.6)
Error in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x else if (is.data.frame(newdata)) as.matrix(model.frame(delete.response(terms(object)), :
NA/NaN/Inf in foreign function call (arg 5)
In this context, each sample corresponds to one treatment.
This is what my data frames look like:
> expr[1:5,1:5]
      V1 V2 V3 V4 V5
Rp1    0  0  0  0  0
Sox17  0  0  0  0  0
St18   0  0  0  0  0
Mybl1  0  0  0  0  0
Sulf1  0  0  0  0  1
> meta[1:5,1:5]
  orig.ident nCount_RNA nFeature_RNA percent.mt timpoint
1    Benitez        318          278  0.9433962       NA
2    Benitez        811          580  0.2466091       NA
3    Benitez        476          392  0.6302521       NA
4    Benitez        302          268  0.3311258       NA
5    Benitez        645          485  0.3100775       NA
Thanks for building such a nice tool.
I have a question about interpreting the results. I understand that a ROGUE value of 1 indicates a completely pure subtype or state, whereas a population with maximum summarization of significant ds yields a purity score of ~0.
However, there are several NA values in the result. How should we interpret them?
I would appreciate your response.
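Dear Dr. Liu,
I would like to use ROGUE, which you created, but my single-cell dataset is very large (more than 20,000 genes × more than 800,000 cells). I am not sure whether ROGUE can be applied to data this large, and I have a few other questions as well.
From an algorithmic and mathematical standpoint, can ROGUE be used on data of this size?
Also, in replies to earlier users' questions, you mentioned that span may need to be increased when using ROGUE. Is there a required range or a rule of thumb for how much to increase it?
When the data are very large, can I subsample a small group of cells from each cluster for the calculation?
Thank you for your help. Best wishes,
阮昭慧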
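The per-cluster subsampling asked about above could look roughly like the following base-R sketch. The helper name `subsample_per_cluster`, the cap of 1000 cells, and the toy labels are all assumptions made for illustration; none of them come from ROGUE, and whether ROGUE values computed on a subsample match those of the full data is exactly the open question in this post.

```r
# Hypothetical helper (not part of ROGUE): keep at most `n_per_cluster`
# randomly chosen cells from each cluster and return their indices.
subsample_per_cluster <- function(labels, n_per_cluster = 1000, seed = 1) {
  set.seed(seed)
  # Group cell positions by cluster label
  idx_by_cluster <- split(seq_along(labels), labels)
  picked <- lapply(idx_by_cluster, function(idx) {
    if (length(idx) > n_per_cluster) {
      idx[sample.int(length(idx), n_per_cluster)]
    } else {
      idx  # small clusters are kept whole
    }
  })
  sort(unlist(picked, use.names = FALSE))
}

# Toy example with made-up cluster labels
labels <- rep(c("A", "B", "C"), times = c(5000, 300, 1200))
keep <- subsample_per_cluster(labels, n_per_cluster = 1000)
table(labels[keep])
```

The same indices would then be used to subset both the expression matrix and the metadata, for example `expr[, keep]` and `meta[keep, ]`, assuming cells are the columns of `expr`.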
Hi,
Thank you for the package. I was able to run it on my integrated dataset to see how pure each cluster is, but when I ran the same thing on two very different clusters (tumor and T cells), the ROGUE value was still very high (0.99), which clearly doesn't make sense. Is there anything I missed or did wrong?
Thank you!
I encountered an error when running this line of code:
rogue.res <- rogue(expr, labels = meta$Tcluster, samples = meta$source, platform = "UMI", span = 0.6)
Error in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x else if (is.data.frame(newdata)) as.matrix(model.frame(delete.response(terms(object)), :
NA/NaN/Inf in foreign function call (arg 5)
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Through tracing, I found that there was a problem during the calculation of prd <- predict(fit, .x$mean.expr) in SE_fun(), specifically in the ROGUE::entropy_fit step.
Here's what my data looks like. I'm not quite sure where the problem is occurring. Can you provide some insight?
fit
Call:
loess(formula = entropy ~ mean.expr, data = tmp, span = span)
Number of Observations: 3379
Equivalent Number of Parameters: NaN
Residual Standard Error: NaN
.x$mean.expr
NOC2L ISG15 C1orf159 TNFRSF4 SDF4 B3GALT6 PUSL1
0.08004271 0.22314355 0.08004271 0.08004271 0.08004271 0.08004271 0.08004271
AURKAIP1 CCNL2 MRPL20 SSU72 MIB2 GNB1 FAAP20
0.08004271 0.08004271 0.28768207 0.08004271 0.08004271 0.08004271 0.15415068
RER1 TNFRSF14 RPL22 TNFRSF25 CAMTA1 VAMP3 UTS2
prd <- predict(fit, .x$mean.expr)
Error in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x else if (is.data.frame(newdata)) as.matrix(model.frame(delete.response(terms(object)), :
NA/NaN/Inf in foreign function call (arg 5)
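The cause of this predLoess error is not established in the reports above. As a first diagnostic, a base-R sketch like the one below summarises a few properties of a genes × cells matrix that are worth ruling out before calling rogue(). The helper name `check_expr` and the toy matrix are assumptions for illustration and are not part of ROGUE. The count of distinct per-gene means is included because the printed `.x$mean.expr` values above show many ties (0.08004271 repeated); whether those ties are related to the error is not confirmed here.

```r
# Hypothetical helper (not part of ROGUE): basic sanity checks on a
# genes x cells expression matrix.
check_expr <- function(expr) {
  expr <- as.matrix(expr)
  list(
    n_non_finite          = sum(!is.finite(expr)),            # NA, NaN, Inf
    n_negative            = sum(expr < 0, na.rm = TRUE),
    n_all_zero_genes      = sum(rowSums(expr != 0, na.rm = TRUE) == 0),
    n_all_zero_cells      = sum(colSums(expr != 0, na.rm = TRUE) == 0),
    n_distinct_gene_means = length(unique(rowMeans(expr, na.rm = TRUE)))
  )
}

# Toy matrix with one NA, one all-zero gene and one all-zero cell
toy <- matrix(c(0, 0, 0,
                1, 2, 0,
                NA, 3, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("g1", "g2", "g3"), c("c1", "c2", "c3")))
res <- check_expr(toy)
str(res)
```

It may also help to look at `table(labels, samples)` for the vectors passed to rogue(), since that shows how many cells fall into each cluster and sample combination.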