japrin / sscclust

simpler single cell RNAseq data clustering

License: GNU General Public License v3.0

R 100.00%

clustering bioinformatics genomics immunology single-cell

sscclust's Introduction

sscClust

sscClust (simpler single cell RNAseq data clustering) is a package that implements multiple functionalities that are basic procedures in single cell RNAseq data analysis, including variable gene identification, dimension reduction, and clustering on the reduced data. It also proposes some new ideas, such as projecting data to a feature space using Spearman correlation, which makes visualization and clustering more efficient, and clustering with subsampling and classification, which makes it feasible to process data from thousands of cells.

Installation

Calculation becomes expensive when the dataset becomes large, so we strongly recommend using an R build linked to an optimized BLAS library, such as ATLAS or MKL. For Windows users, Microsoft R Open is recommended; for Unix-like users, please refer to this for how to compile R using an external BLAS library.
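To see which BLAS an existing R installation is linked against, something like the following base R sketch may help. It assumes R 3.4.0 or later, where sessionInfo() reports the BLAS and LAPACK library paths; the matrix size used for the rough timing is an arbitrary choice, not a recommendation from this package.

```r
# Report which BLAS/LAPACK libraries this R session is linked against
# (these fields are available in R >= 3.4.0).
si <- sessionInfo()
cat("BLAS:  ", si$BLAS, "\n")
cat("LAPACK:", si$LAPACK, "\n")

# Rough timing of a dense matrix product; an optimized BLAS is usually
# noticeably faster here than the reference BLAS shipped with R.
set.seed(1)
n <- 500  # arbitrary size, chosen only to keep the check quick
m <- matrix(rnorm(n * n), n, n)
elapsed <- system.time(p <- crossprod(m))["elapsed"]
cat("crossprod elapsed (s):", elapsed, "\n")
```

Comparing the elapsed time before and after switching BLAS gives a quick sanity check that the optimized library is actually in use.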

Package sscVis is required.

To install this package, simply:

install.packages("devtools")
devtools::install_github("Japrin/sscVis")
devtools::install_github("Japrin/sscClust")

Example

Run the clustering pipeline using all data:

data("sce.Pollen")
sce.all <- ssc.run(sce.Pollen, subsampling=FALSE, k.batch=11)

Run the clustering pipeline with subsampling:

data("sce.Pollen")
sce.sub <- ssc.run(sce.Pollen, subsampling=TRUE, sub.frac = 0.8, k.batch=11)

More information can be found in the vignettes:

clustering by various methods, see this vignette.
clustering using Spearman correlation and subsampling, see this vignette.


sscclust's Issues

Vignette Code ssc.run argument method.vgene: "sd" does not work

Hi Japrin

When running your vignette, the following call in the block "#All in One":

sce.Pollen <- ssc.run(sce.Pollen,method.vgene = "sd",sd.n = 1500,method.reduction = "pca",method.clust = "kmeans", k.batch=11,seed = 9997)

yields the error message:

Error in ssc.reduceDim(obj, assay.name = assay.name, method = method.reduction, : No variable genes identified by method sd !

It can be solved by changing the value of the method.vgene argument from "sd" to "HVG.sd", that is, by running

sce.Pollen <- ssc.run(sce.Pollen,method.vgene = "HVG.sd",sd.n = 1500,method.reduction = "pca",method.clust = "kmeans", k.batch=11,seed = 9997)

which then works. This makes sense, because in the invoked ssc.variableGene function the method argument can be one of "HVG.sd", "HVG.mean.sd", "HVG.trendVar" (but not "sd").

Is this the correct way of running this vignette?

Kind regards
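One way to check which values an argument accepts is to read the function's default for that argument. The sketch below is only an illustration: arg_choices and variable_gene_demo are hypothetical names, and the stand-in function merely mirrors the three values quoted above. Whether the real ssc.variableGene declares its choices in the default of method is an assumption.

```r
# Return the default value of one argument of a function. For functions that
# declare their accepted values as the default, this is the list of choices.
arg_choices <- function(f, arg) eval(formals(f)[[arg]])

# Hypothetical stand-in, NOT the real sscClust code: it only mirrors the three
# values quoted in the issue above.
variable_gene_demo <- function(method = c("HVG.sd", "HVG.mean.sd", "HVG.trendVar")) {
  match.arg(method)
}

choices <- arg_choices(variable_gene_demo, "method")
print(choices)

# Check a candidate value before passing it on.
print("sd" %in% choices)
print("HVG.sd" %in% choices)

# With the package installed, the same helper could be pointed at the real
# function (assumption: its choices are declared in the default of `method`):
# arg_choices(sscClust::ssc.variableGene, "method")
```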

Confusion for density based clustering method

Hello, Japrin
According to the clustering method documentation, running dimension reduction with both PCA and tSNE and then density-based clustering on the tSNE map should use a script like:

sce.Pollen <- ssc.run(sce.Pollen,method.vgene = "sd",sd.n = 1500,method.reduction = "pca",
                      method.clust = "dpclust", 
                      parfile = system.file("extdata/Pollen.par.r",package = "sscClust"),
                      out.prefix = "./Pollen.dpclust", seed=9997)

But the argument method.reduction used here is "pca", not "tsne", so I'm confused about the actual method used here.
Which one of the following is right?

1. PCA -> tSNE -> use tSNE data to perform density clustering
2. PCA -> use PCA data to perform density clustering -> map the clustering result onto a tSNE map

By the way, the seed argument seems to make a difference in the clustering result. Do you have any recommendation for this?
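On the seed question, a generic illustration with base R's kmeans (not sscClust's density clustering) may be useful: fixing the seed makes a randomly initialised clustering reproducible, and using several random starts tends to reduce how much the result depends on the seed. The toy data and the helper run_kmeans below are made up for this sketch.

```r
# Toy data: two Gaussian blobs in two dimensions (made up for illustration).
set.seed(42)
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 3), ncol = 2))

# Hypothetical helper: run k-means after fixing the random seed.
run_kmeans <- function(data, centers, seed) {
  set.seed(seed)
  kmeans(data, centers = centers, nstart = 1)$cluster
}

# The same seed gives the same cluster assignment on repeated runs.
a <- run_kmeans(x, centers = 3, seed = 9997)
b <- run_kmeans(x, centers = 3, seed = 9997)
print(identical(a, b))

# Several random starts keep the best solution found, which usually makes
# the result less sensitive to the particular seed.
set.seed(1)
fit <- kmeans(x, centers = 3, nstart = 25)
print(table(fit$cluster))
```

Recording the seed alongside the results is the simplest way to make a given clustering reproducible.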

How to display two groups of cells in one t-SNE map

Hi Japrin

Thanks again for yesterday's reply. A new question has come up:
There is a t-SNE map (section b) displaying the cell clustering result in a paper related to this package:

It is described that the clustering analysis was performed separately for CD8+ and CD4+ cells, so I'm wondering how to put the two together, since I didn't find any function for this in the package.

Looking forward to your reply,
Thanks a lot!

perform "str(metadata(sce.Pollen)$ssc$variable.gene$sd) " got "NULL"

Hi, Japrin
I am new to the R language. I ran the code following your guide, but I got
NULL instead of chr [1:1500] "G5596" "G945" "G244" "G496" "G3558" "G5598" "G1909..." when I inspect the variable gene selection in this part:

sce.Pollen <- ssc.variableGene(sce.Pollen,method="sd",sd.n=1500)
str(metadata(sce.Pollen)$ssc$variable.gene$sd)

Could you give me some suggestions? Thanks a lot!
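A NULL from str() on a nested list usually means that no element with that name exists. A quick way to find out is to list the names that are actually present. The sketch below uses a plain list as a stand-in for metadata(sce.Pollen); the element name "HVG.sd" is an assumption that mirrors the method name from the first issue, not confirmed package output.

```r
# Stand-in for metadata(sce.Pollen): a plain nested list, NOT real package
# output. The name "HVG.sd" is assumed for illustration.
md <- list(ssc = list(variable.gene = list(HVG.sd = c("G5596", "G945", "G244"))))

# Asking for an element that does not exist returns NULL, without an error.
print(is.null(md$ssc$variable.gene$sd))

# List the names that are present, then inspect the first one.
slots <- names(md$ssc$variable.gene)
print(slots)
str(md$ssc$variable.gene[[slots[1]]])

# On the real object, the equivalent check would be:
# names(metadata(sce.Pollen)$ssc$variable.gene)
```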

R package 'Error in t.default(W) : argument is not a matrix'

Hi there,

When I try to run:

sce_10x<- ssc.reduceDim(sce_10x,method="pca",seed = 9997, assay.name="counts")

I get the error above. Could you help me fix it?

Thanks!
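One possible cause, offered only as a guess, is that the counts assay is stored as a sparse matrix (common for 10x data) rather than a base R matrix. The sketch below uses the Matrix package to show how to check the class and coerce to a dense matrix; whether this resolves the error in ssc.reduceDim is an assumption, and a dense copy of a large dataset may need a lot of memory.

```r
library(Matrix)

# Small sparse matrix standing in for a 10x counts assay (made-up values).
m <- Matrix(c(0, 1, 2, 0, 0, 3), nrow = 2, sparse = TRUE)

# A sparse Matrix object is not a base R matrix.
print(class(m))
print(is.matrix(m))

# Coerce to a dense base matrix; the values are unchanged.
dense <- as.matrix(m)
print(is.matrix(dense))

# On the real object, the equivalent check and conversion might look like this
# (assumes the SummarizedExperiment accessor `assay` is available):
# class(assay(sce_10x, "counts"))
# assay(sce_10x, "counts") <- as.matrix(assay(sce_10x, "counts"))
```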
