kharchenkolab / conos

R package for the joint analysis of multiple single-cell RNA-seq datasets

License: GNU General Public License v3.0

R 69.13% C++ 30.43% Dockerfile 0.43%
scrna-seq single-cell-rna-seq batch-correction

conos's Introduction

conos

Conos: Clustering On Network Of Samples

  • What is conos? Conos is an R package to wire together large collections of single-cell RNA-seq datasets, which allows for both the identification of recurrent cell clusters and the propagation of information between datasets in multi-sample or atlas-scale collections. It focuses on the uniform mapping of homologous cell types across heterogeneous sample collections. For instance, users could investigate a collection of dozens of peripheral blood samples from cancer patients combined with dozens of controls, which perhaps includes samples of a related tissue such as lymph nodes.

  • How does it work? Conos applies one of many error-prone methods to align each pair of samples in a collection, establishing weighted inter-sample cell-to-cell links. The resulting joint graph can then be analyzed to identify subpopulations across different samples. Cells of the same type will tend to map to each other across many such pairwise comparisons, forming cliques that can be recognized as clusters (graph communities).

    Conos processing can be divided into three phases:

    • Phase 1: Filtering and normalization Each individual dataset in the sample panel is filtered and normalized using standard packages for single-dataset processing: either pagoda2 or Seurat. Specifically, Conos relies on these methods to perform cell filtering, library size normalization, identification of overdispersed genes and, in the case of pagoda2, variance normalization. (Conos is robust to variations in the normalization procedures, but it is recommended that all of the datasets be processed uniformly.)
    • Phase 2: Identify multiple plausible inter-sample mappings Conos performs pairwise comparisons of the datasets in the panel to establish an initial error-prone mapping between cells of different datasets.
    • Phase 3: Joint graph construction These inter-sample edges from Phase 2 are then combined with lower-weight intra-sample edges during the joint graph construction. The joint graph is then used for downstream analysis, including community detection and label propagation. For a comprehensive description of the algorithm, please refer to our publication.
  • What does it produce? In essence, conos takes a large, potentially heterogeneous panel of samples and produces a clustering that groups similar cell subpopulations together in a way that is robust to inter-sample variation.

  • What are the advantages over existing alignment methods? Conos is robust to heterogeneity of samples within a collection, as well as noise. The ability to resolve finer subpopulation structure improves as the size of the panel increases.
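As a rough illustration of the pairwise mapping idea in Phase 2, the base-R sketch below links cells from two toy samples when each cell is among the other's nearest neighbours (mutual nearest neighbours). This is not the conos implementation: the function name mutualNearestNeighbors, the toy coordinates, and the use of plain Euclidean distance in two dimensions are all assumptions made for illustration.

```r
# Toy sketch of mutual nearest neighbour (mNN) matching between two samples.
# NOT the conos implementation; all names here are hypothetical.
mutualNearestNeighbors <- function(a, b, k = 1) {
  na <- nrow(a)
  nb <- nrow(b)
  # Euclidean distances between every row of a and every row of b
  d <- as.matrix(dist(rbind(a, b)))[seq_len(na), na + seq_len(nb), drop = FALSE]
  # k nearest cells in b for each cell in a, and the reverse
  nn.ab <- lapply(seq_len(na), function(i) order(d[i, ])[seq_len(k)])
  nn.ba <- lapply(seq_len(nb), function(j) order(d[, j])[seq_len(k)])
  # Keep the pair (i, j) only if j is a neighbour of i AND i is a neighbour of j
  pairs <- lapply(seq_len(na), function(i) {
    js <- Filter(function(j) i %in% nn.ba[[j]], nn.ab[[i]])
    if (length(js) == 0) return(NULL)
    data.frame(cell.a = i, cell.b = js, dist = unname(d[i, js]), row.names = NULL)
  })
  do.call(rbind, pairs)  # NULL if no mutual pairs exist
}

# Two toy "samples" with cells as rows
a <- rbind(c(0, 0), c(0, 1), c(5, 5))
b <- rbind(c(0.1, 0), c(5, 5.2), c(10, 10))
links <- mutualNearestNeighbors(a, b, k = 1)
print(links)
```

In this toy case the second cell of a and the third cell of b get no inter-sample link, because their nearest neighbour in the other sample prefers a different cell. In a joint graph, such links would become weighted inter-sample edges.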

Basics of using conos

Given a list of individual processed samples (pl), conos processing can be as simple as this:

# Construct Conos object, where pl is a list of pagoda2 objects 
con <- Conos$new(pl)

# Build joint graph
con$buildGraph()

# Find communities
con$findCommunities()

# Generate embedding
con$embedGraph()

# Plot joint graph
con$plotGraph()

# Plot panel with joint clustering results
con$plotPanel()

To see more documentation on the class Conos, run ?Conos.

Tutorials

Please see the following tutorials for detailed examples of how to use conos:

Conos walkthrough:

Adjustment of alignment strength with conos:

Integration with Scanpy:

Note that for integration with Scanpy, users need to save conos files to disk from an R session, and then load these files into Python.

Save conos for Scanpy:

Load conos files into Scanpy:

Integrating RNA-seq and ATAC-seq with conos:

Running RNA velocity on a Conos object

To obtain an RNA velocity plot from a Conos object, you first have to use the dropEst pipeline to align and annotate your single-cell RNA-seq measurements. You can see this tutorial and this shell script to see how it can be done. In this example we specifically assume that you ran dropEst with the -V option to get estimates of unspliced/spliced counts directly from dropEst. Secondly, you need the velocyto.R package for the actual velocity estimation and visualisation.

After running dropEst you should have 2 files for each of the samples:

  • sample.rds (matrix of counts)
  • sample.matrices.rds (3 matrices of exons, introns and spanning reads)

The .matrices.rds files are the velocity files. Load them into R as a list, in the same order as the samples you give to conos. Load, preprocess, and integrate the count matrices (.rds) with conos as you normally would. Before running the velocity estimation, you must at least create an embedding and run Leiden clustering. Finally, you can estimate the velocity as follows:

### Assuming con is your Conos object and cms.list is the list of your velocity files ###

library(velocyto.R)
library(magrittr)  # provides the %$% exposition pipe used below

# Preprocess the velocity files to match the Conos object
vi <- velocityInfoConos(cms.list = cms.list, con = con, 
                        n.odgenes = 2e3, verbose = TRUE)

# Estimate RNA velocity
vel.info <- vi %$%
  gene.relative.velocity.estimates(emat, nmat, cell.dist = cell.dist, 
                                   deltaT = 1, kCells = 25, fit.quantile = 0.05, n.cores = 4)

# Visualise the velocity on your Conos embedding 
# Takes a very long time! 
# Assign to a variable to speed up subsequent recalculations
cc.velo <- show.velocity.on.embedding.cor(vi$emb, vel.info, n = 200, scale = 'sqrt', 
                                          cell.colors = ac(vi$cell.colors, alpha = 0.5), 
                                          cex = 0.8, grid.n = 50, cell.border.alpha = 0,
                                          arrow.scale = 3, arrow.lwd = 0.6, n.cores = 4, 
                                          xlab = "UMAP1", ylab = "UMAP2")

# Use cc=cc.velo$cc when running again (skips the most time consuming delta projections step)
show.velocity.on.embedding.cor(vi$emb, vel.info, cc = cc.velo$cc, n = 200, scale = 'sqrt', 
                               cell.colors = ac(vi$cell.colors, alpha = 0.5), 
                               cex = 0.8, arrow.scale = 15, show.grid.flow = TRUE, 
                               min.grid.cell.mass = 0.5, grid.n = 40, arrow.lwd = 2,
                               do.par = FALSE, cell.border.alpha = 0.1, n.cores = 4,
                               xlab = "UMAP1", ylab = "UMAP2")
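
The loading step described earlier in this section (one .matrices.rds file per sample, in the same order as the samples given to conos) could look roughly like the sketch below. The helper loadVelocityFiles, the directory name, and the file layout are assumptions for illustration only; adapt them to wherever your dropEst output actually lives.

```r
# Minimal sketch, assuming dropEst wrote one <sample>.matrices.rds per sample
# into a single directory. loadVelocityFiles is a hypothetical helper.
loadVelocityFiles <- function(data.dir, sample.names) {
  paths <- file.path(data.dir, paste0(sample.names, ".matrices.rds"))
  missing <- paths[!file.exists(paths)]
  if (length(missing) > 0) {
    stop("missing velocity files: ", paste(missing, collapse = ", "))
  }
  # Read in the order of sample.names so the list lines up with the conos samples
  cms.list <- lapply(paths, readRDS)
  names(cms.list) <- sample.names
  cms.list
}

# Usage (hypothetical directory); take the order from the Conos object itself:
# cms.list <- loadVelocityFiles("dropest_output", names(con$samples))
```

Failing early on a missing file avoids a silently misaligned list, which would otherwise pair velocity matrices with the wrong samples.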

Installation

To install the stable version from CRAN, use:

install.packages('conos')

To install the latest version of conos, use:

install.packages('devtools')
devtools::install_github('kharchenkolab/conos')

System dependencies

The dependencies are inherited from pagoda2. Note that this package also has the dependency igraph, which requires various libraries to install correctly. Please see the installation instructions at that page for more details, along with the github README here.

Ubuntu dependencies

To install system dependencies using apt-get, use the following:

sudo apt-get update
sudo apt-get -y install libcurl4-openssl-dev libssl-dev libxml2-dev libgmp-dev libglpk-dev

Red Hat-based distributions dependencies

For Red Hat distributions using yum, use the following command:

sudo yum update
sudo yum install openssl-devel libcurl-devel libxml2-devel gmp-devel glpk-devel

Mac OS

Using the Mac OS package manager Homebrew, try the following command:

brew update
brew install openssl curl-openssl libxml2 glpk gmp

(You may need to run brew uninstall curl in order for brew install curl-openssl to be successful.)

As of version 1.3.1, conos should successfully install on Mac OS. However, if there are issues, please refer to the following wiki page for further instructions on installing conos with Mac OS: Installing conos for Mac OS

Running conos via Docker

If your system configuration is making it difficult to install conos natively, an alternative way to get conos running is through a docker container.

Note: On Mac OS X, Docker Machine has memory and CPU limits. To adjust them, please check the instructions either for CLI or for Docker Desktop.

Ready-to-run Docker image

The docker distribution has the latest version and also includes the pagoda2 package. To start a docker container, first install docker on your platform and then start the conos container with the following command in the shell:

docker run -p 8787:8787 -e PASSWORD=pass pkharchenkolab/conos:latest

The first time you run this command, it will download several large images, so make sure that you have a fast internet connection. You can then point your browser to http://localhost:8787/ to get an RStudio environment with pagoda2 and conos installed (please log in using credentials username=rstudio, password=pass). Explore the docker --mount option to give the docker container access to your local files.

Note: If you already downloaded the docker image and want to update it, please pull the latest image with:

docker pull pkharchenkolab/conos:latest

Building Docker image from the Dockerfile

If you want to build the image on your own, download the Dockerfile (available in this repo under /docker) and run the following command to build it:

docker build -t conos .

This will create a "conos" docker image on your system (please be patient, as the build could take approximately 30-50 minutes to finish). You can then run it using the following command:

docker run -d -p 8787:8787 -e PASSWORD=pass --name conos -it conos

References

If you find this software useful for your research, please cite the corresponding paper:

Barkas N., Petukhov V., Nikolaeva D., Lozinsky Y., Demharter S., Khodosevich K., & Kharchenko P.V. 
Joint analysis of heterogeneous single-cell RNA-seq dataset collections. 
Nature Methods, (2019). doi:10.1038/s41592-019-0466-z

The R package can be cited as:

Viktor Petukhov, Nikolas Barkas, Peter Kharchenko, and Evan
Biederstedt (2021). conos: Clustering on Network of Samples. R
package version 1.5.2.

conos's People

Contributors

barkasn, evanbiederstedt, gmaciag, kant, mojaveazure, pkharchenko, rrydbirk, vpetukhov, yarloz-old

conos's Issues

can't reproduce the figure in tutorial

Thank you in advance if you can help.

I am trying to walk through the tutorial example step by step in the docker image. However, after running
con$buildGraph(k=15, k.self=5, space='PCA', ncomps=30, n.odgenes=2000, matching.method='mNN', metric='angular', score.component.variance=TRUE, verbose=TRUE)
con$findCommunities(method=leiden.community, resolution=1)
con$plotPanel(font.size=4)

I got a figure like this:
[image: plotPanel output]
The clusters did not seem to correspond across datasets.

Are the parameters in the notebook supposed to reproduce the tutorial figures, or should other parameter values be tried?

During the buildGraph step, I got a warning:
[image: warning message]

error with single cell summary object

I have a list of SingleCellExperiment objects:

List of 3
$ :Formal class 'SingleCellExperiment' [package "SingleCellExperiment"] with 10 slots
$ :Formal class 'SingleCellExperiment' [package "SingleCellExperiment"] with 10 slots
$ :Formal class 'SingleCellExperiment' [package "SingleCellExperiment"] with 10 slots

I was trying to analyze this list with Conos and got the following error:

Error in .Object$initialize(...) : x is not of class dgCMatrix or matrix

Do I need to convert the SingleCellExperiment objects to the dgCMatrix class? Please let me know if anybody has suggestions. Thanks

can't get correctGenes

Thanks for developing this wonderful package for batch correction of single cell studies. I am trying to obtain the batch corrected count matrix from the example given in the tutorial. After running the following code, I get an error which I could not figure out how to solve. Thanks for your help!

panel <- readRDS(file.path(find.package('conos'),'extdata','panel.rds'))
tutorial <- lapply(panel, basicSeuratProc)
conT <- Conos$new(tutorial, n.cores=4)
conT$buildGraph(k=15, k.self=5, space='PCA', ncomps=30, n.odgenes=2000, matching.method='mNN', metric='angular', score.component.variance=TRUE, verbose=TRUE)
conT$findCommunities(method=leiden.community, resolution=1)
conT$correctGenes()
Error in s$misc : $ operator not defined for this S4 class

Hide specific dataset(s) from embedded graphs

After building the sample embeddings and clustering, is there a way to hide specific samples from the output visualizations at the display level, without altering the graph structure?

This could be useful both for data presentation and for interrogating per-sample expression variance at the sub-cluster level.

Long runtime in buildGraph with score.component.variance=T

This takes 10+ min to run. Is it necessary to include score.component.variance or can it be excluded?

Code:
con$buildGraph(k=15, k.self=5, space='PCA', ncomps=30, n.odgenes=2000, matching.method='mNN', metric='angular', verbose=TRUE, score.component.variance=T)

Object:

lapply(con$samples,function(x) str(x$counts))
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
..@ i : int [1:14427418] 13 58 96 117 119 134 141 148 180 198 ...
..@ p : int [1:18657] 0 317 1322 1363 1686 1717 1749 1804 2204 3250 ...
..@ Dim : int [1:2] 6255 18656
..@ Dimnames:List of 2
.. ..$ : chr [1:6255] "S1_AAACCCAAGATCGGTG" "S1_AAACCCAAGGCAGGGA" "S1_AAACCCACAAATAGCA" "S1_AAACCCACATGGAATA" ...
.. ..$ : chr [1:18656] "AL627309.1" "AL669831.5" "LINC00115" "NOC2L" ...
..@ x : Named num [1:14427418] 0.0471 0.1322 0.0632 0.0369 0.0526 ...
.. ..- attr(*, "names")= chr [1:14427418] "S1_AAACGAAGTGTGGTCC" "S1_AAAGTCCTCGGCCAAC" "S1_AACAGGGAGACATATG" "S1_AACCATGCAGAGAAAG" ...
..@ factors : list()
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
..@ i : int [1:16225403] 5 28 63 75 99 118 130 142 175 220 ...
..@ p : int [1:18205] 0 389 1510 1589 1653 2184 2259 2366 3172 4162 ...
..@ Dim : int [1:2] 6470 18204
..@ Dimnames:List of 2
.. ..$ : chr [1:6470] "S2_AAACCCAAGATGGCAC" "S2_AAACCCAAGTGCGCTC" "S2_AAACCCACAACCAGAG" "ctrl_039_AAACCCACACTGCGTG" ...
.. ..$ : chr [1:18204] "AL627309.1" "AL669831.5" "LINC00115" "SAMD11" ...
..@ x : Named num [1:16225403] 0.2572 0.0987 0.0631 0.1405 0.0882 ...
.. ..- attr(*, "names")= chr [1:16225403] "S2_AAACCCAGTGAAGCTG" "S2_AAAGAACGTTTGATCG" "S2_AAAGTCCTCGCCATAA" "S2_AAATGGAGTATACGGG" ...
..@ factors : list()
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
..@ i : int [1:9283290] 16 18 23 26 50 62 65 75 100 109 ...
..@ p : int [1:17626] 0 271 313 1093 1169 1234 1549 1603 1698 1797 ...
..@ Dim : int [1:2] 3500 17625
..@ Dimnames:List of 2
.. ..$ : chr [1:3500] "S3_AAACCCACACTACACA" "S3_AAACCCAGTACTAAGA" "S3_AAACCCATCGTTTACT" "S3_AAACCCATCTTGGAAC" ...
.. ..$ : chr [1:17625] "AL627309.1" "AC114498.1" "AL669831.5" "LINC00115" ...
..@ x : Named num [1:9283290] 0.1539 0.039 0.0366 0.2743 0.0311 ...
.. ..- attr(*, "names")= chr [1:9283290] "S3_AAAGGATCACGCAAAG" "S3_AAAGGATGTAGCTCGC" "S3_AAAGGGCTCGGTTGTA" "S3_AAAGGTACACGCACCA" ...
..@ factors : list()
$S1
NULL

$S2
NULL

$S3
NULL

sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /cm/shared/apps/intel/parallel_studio_xe/2018_update2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64_lin/libmkl_gf_lp64.so

locale:
[1] C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] conos_0.0.0.9002 igraph_1.2.4 Matrix_1.2-15 RLinuxModules_0.2

loaded via a namespace (and not attached):
[1] mclust_5.4.3 Rcpp_1.0.1 mvtnorm_1.0-10
[4] lattice_0.20-38 GO.db_3.7.0 class_7.3-14
[7] assertthat_0.2.1 digest_0.6.18 mime_0.6
[10] R6_2.4.0 plyr_1.8.4 stats4_3.5.0
[13] RSQLite_2.1.1 ggplot2_3.1.1 pillar_1.3.1
[16] rlang_0.3.4 lazyeval_0.2.2 diptest_0.75-7
[19] irlba_2.3.3 whisker_0.3-2 blob_1.1.1
[22] kernlab_0.9-27 S4Vectors_0.20.1 urltools_1.7.2
[25] triebeard_0.3.0 bit_1.1-14 munsell_0.5.0
[28] shiny_1.3.0 compiler_3.5.0 httpuv_1.5.1
[31] pkgconfig_2.0.2 BiocGenerics_0.28.0 base64enc_0.1-3
[34] pcaMethods_1.74.0 htmltools_0.3.6 nnet_7.3-12
[37] tidyselect_0.2.5 tibble_2.1.1 gridExtra_2.3
[40] pagoda2_0.0.0.9002 IRanges_2.16.0 dendextend_1.10.0
[43] viridisLite_0.3.0 crayon_1.3.4 dplyr_0.8.0.1
[46] later_0.8.0 MASS_7.3-51.1 grid_3.5.0
[49] xtable_1.8-3 gtable_0.3.0 DBI_1.0.0
[52] magrittr_1.5 scales_1.0.0 dendsort_0.3.3
[55] viridis_0.5.1 promises_1.0.1 flexmix_2.3-15
[58] robustbase_0.93-4 brew_1.0-6 rjson_0.2.20
[61] tools_3.5.0 fpc_2.1-11.1 bit64_0.9-7
[64] Biobase_2.42.0 glue_1.3.1 trimcluster_0.1-2.1
[67] DEoptimR_1.0-8 purrr_0.3.2 Rook_1.1-1
[70] parallel_3.5.0 AnnotationDbi_1.44.0 colorspace_1.4-1
[73] cluster_2.0.7-1 prabclus_2.2-7 memoise_1.1.0
[76] modeltools_0.2-22

Failure to install conos on MacOS-Mojave

Hi folks,

I am trying to install conos on the new MacOS-Mojave, but get the following error:

clang-4.0: error: no such file or directory: '/Library/Frameworks/R.framework/Versions/3.5/Resources/library/igraph/libs/igraph.dylib'
make: *** [conos.so] Error 1
ERROR: compilation failed for package ‘conos’

igraph is installed following pagoda2 tutorial and installation of pagoda2 is also successful. I managed to install conos on an older version of macos (Sierra), so I wonder if this is related to this newer version of macos. Please help me fix this error.

Many thanks,
Li

Implement saving all files for ScanPy into one HDF5 file instead

Thanks! It's not urgent, but would be amazing if you'll add it at some point. Anyway, I'll merge this PR without h5.
For R there are two libs: rhdf5 and hdf5r. I'd prefer the latter, as it's on CRAN, but honestly in my experience rhdf5 was easier to use. So it's up to you. Just ensure that it's in Suggests, not in Imports.

Originally posted by @VPetukhov in #45

We are saving all the output from the saveConosForScanPy function into separate files. We should instead save it all into a single HDF5 style file and update the tutorial for loading into ScanPy to reflect the changes.

@VPetukhov could you change the status of this issue into enhancement? I don't seem to have that possibility.

getRawCountMatrix for Seurat should return values only for cells actually used in the analysis

I encountered this issue while trying to run getPerCellTypeDE() on my dataset, but I assume its effect could be more widespread, since it is caused by how conos accesses the raw count matrix for Seurat objects.

getRawCountMatrix() for a Seurat sample returns the object stored in sample@raw.data. This is where I (and I assume other Seurat users) store the expression matrix for all the cells, before any filtering (for UMI count, mitochondrial content, etc.). However, my analysis in conos is done only on the cells stored in sample@data, that is, only the cells that passed the quality filtering step. Therefore, for a given sample, the number of cells in sample@raw.data and sample@data differs.

This discrepancy caused an error for me when trying to run getPerCellTypeDE(). That function calls rawMatricesWithCommonGenes(), which uses getRawCountMatrix() to return a matrix that is then passed to collapseCellsByType(). collapseCellsByType() also uses the groups argument, which holds the cluster names for all the cells used in the analysis (so the ones stored in sample@data). The matrix and the groups vector have different numbers of cells (rows), which ultimately produces an error at the aggr <- Matrix.utils::aggregate.Matrix(cm, g1) step (line 76).

I understand that conos may need the raw values for some of the functions (like DE), but then it should only take those values for the cells that are actually used in the rest of the analysis.
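
The behaviour requested above can be illustrated with a small base-R sketch: restrict the raw matrix to the cells that carry a cluster label before aggregating. This is a toy example, not conos code; the object names raw.counts, groups, and used.counts are hypothetical, and base R's rowsum() stands in for the aggregation step.

```r
# Toy sketch of the requested behaviour; not conos code. Names are hypothetical.
# Raw counts for ALL cells (cells as rows, genes as columns), before filtering
raw.counts <- matrix(1:12, nrow = 4,
                     dimnames = list(paste0("cell", 1:4), paste0("gene", 1:3)))

# Cluster labels exist only for the cells that passed filtering (cell2 was dropped)
groups <- factor(c(cell1 = "A", cell3 = "B", cell4 = "A"))

# Restrict the raw matrix to the labelled cells, in the same order as `groups`
used.counts <- raw.counts[names(groups), , drop = FALSE]
stopifnot(identical(rownames(used.counts), names(groups)))

# Per-cluster aggregation now sees matching numbers of cells and labels
aggr <- rowsum(used.counts, group = groups)
print(aggr)
```

Indexing by names(groups), rather than by position, keeps the rows aligned with the labels even when the filtered cells are stored in a different order.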

getPerCellTypeDE errors in DESeq2

When there are errors, the function just puts an NA in the slot. Common errors arise because some clusters are often not present in all samples, so some comparisons are not possible.
A check for that should be included, so that the possible comparisons still run and the user is at least warned about the ones that cannot.

I have also noted the internal error ("some values in assay are not integers") from DESeqDataSet
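
A check of the kind suggested above could be sketched in base R as follows: tabulate cells per cluster and sample, then report the clusters that are absent from some samples. This is a toy example under stated assumptions, not the conos or DESeq2 code path; the data frame cells and its column names are hypothetical.

```r
# Toy sketch; not conos code. Sample and cluster names are hypothetical.
cells <- data.frame(
  sample  = c("s1", "s1", "s1", "s2", "s2", "s3", "s3", "s3"),
  cluster = c("T",  "B",  "NK", "T",  "B",  "T",  "T",  "NK")
)

# Number of cells per cluster (rows) and sample (columns)
counts <- table(cells$cluster, cells$sample)

# For each cluster, the samples in which it has no cells
missing <- lapply(rownames(counts), function(cl) colnames(counts)[counts[cl, ] == 0])
names(missing) <- rownames(counts)
missing <- Filter(length, missing)  # keep only clusters absent somewhere

# Warn about the affected clusters instead of silently returning NA
for (cl in names(missing)) {
  warning(sprintf("cluster '%s' is absent from sample(s): %s",
                  cl, paste(missing[[cl]], collapse = ", ")), call. = FALSE)
}
```

Clusters that do not appear in `missing` are present in every sample, so comparisons for them can go ahead unchanged.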

installation error

Hey, trying to reinstall conos (newer version) on R 3.6 and have been getting the following error:

...
Note: wrong number of arguments to '*'
Note: wrong number of arguments to 'exp'
** help
Error : (converted from warning) /tmp/RtmptnDhVO/R.INSTALLf8f7647cfae2/conos/man/getClusterPrivacy.Rd:6: unexpected section header '\usage'
ERROR: installing Rd objects failed for package ‘conos’

buildGraph fails at local pairs step

Hey!

I have another issue with the buildGraph function.
It fails at the local pairs step with the following error:

RRuntimeError: Error in `$<-.data.frame`(`*tmp*`, "type", value = 0) : 
  replacement has 1 row, data has 0
Calls: <Anonymous> ... <Anonymous> -> getLocalEdges -> $<- -> $<-.data.frame

Visualise conos embedding in scanpy

This is a suggestion on updating the scanpy integration tutorial.

Since we are loading the conos embedding into scanpy anyway, it would be nice to be able to plot that exact embedding in scanpy as well (to keep the style of figures consistent, etc.). That can easily be done if we add the following line when creating the scanpy object:

adata.obsm['X_pca'] = pca_df.values

Then we can plot the embedding with:

sc.pl.pca(adata, color='louvain')

Scanpy integration fails during analysis for scanpy > 1.4.1

I've been trying to use the two new vignettes you have provided for integrating conos with scanpy. However, when following your ipython notebook I get an error when trying to run sc.tl.umap(adata). The error is:

File "/Users/.../anaconda/envs/scanpy-paga/lib/python3.6/site-packages/scanpy/tools/_umap.py", line 123, in umap
    neigh_params = adata.uns['neighbors']['params']

KeyError: 'params' 

adata.uns['neighbors'] is a dictionary that we create one line earlier and it only contains keys connectivities and distances. So, we're missing the key params. However, I don't know myself what value that key should hold.

Could you please help?

'test.stability' produces error in findCommunities

Is it adjustedRand from clues package that's not called correctly?

con$findCommunities(method = leiden.community, min.group.size = 15, test.stability = T, resolution=1)
running 100 subsampling iterations ... done
calculating flat stability stats ... adjusted Rand ... Error in conos:::papply(sr, function(o) { :
Errors in papply: Error in adjustedRand(as.integer(ol), as.integer(cls.groups[names(ol)]), :
could not find function "adjustedRand"
Errors in papply: Error in adjustedRand(as.integer(ol), as.integer(cls.groups[names(ol)]), :
could not find function "adjustedRand"
Errors in papply: Error in adjustedRand(as.integer(ol), as.integer(cls.groups[names(ol)]), :
could not find function "adjustedRand"
Errors in papply: Error in adjustedRand(as.integer(ol), as.integer(cls.groups[names(ol)]), :
could not find function "adjustedRand"
Errors in papply: Error in adjustedRand(as.integer(ol), as.integer(cls.groups[names(ol)]), :
could not find function "adjustedRand"
Errors in papply: Error in adjustedRand(as.integer(ol), as.integer(cls.groups[names(ol)]), :
could not find function "adjustedRand"
Errors in papply: Error in adjustedRand(as.integer(ol), as.integer(cls.groups[names(ol)]), :
could not find function "adjustedRand"
Errors in papply: Error in adjustedRan
In addition: Warning message:
In mclapply(..., mc.cores = n.cores, mc.preschedule = mc.preschedule) :
100 function calls resulted in an error

sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.2 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] ggplot2_3.2.1 conos_1.1.2 pagoda2_0.1.0 igraph_1.2.4.1 Matrix_1.2-17 magrittr_1.5

loaded via a namespace (and not attached):
[1] Rcpp_1.0.2 RSpectra_0.15-0 later_1.0.0 compiler_3.5.3 pillar_1.4.2 base64enc_0.1-3 viridis_0.5.1
[8] tools_3.5.3 uwot_0.1.4 dendextend_1.12.0 digest_0.6.22 tibble_2.1.3 gtable_0.3.0 lattice_0.20-38
[15] viridisLite_0.3.0 pkgconfig_2.0.3 rlang_0.4.0 shiny_1.4.0 rstudioapi_0.10 fastmap_1.0.1 gridExtra_2.3
[22] withr_2.1.2 dplyr_0.8.3 triebeard_0.3.0 grid_3.5.3 tidyselect_0.2.5 glue_1.3.1 R6_2.4.0
[29] Rook_1.1-1 irlba_2.3.3 purrr_0.3.3 promises_1.1.0 htmltools_0.4.0 scales_1.0.0 urltools_1.7.3
[36] MASS_7.3-51.3 assertthat_0.2.1 xtable_1.8-4 mime_0.7 colorspace_1.4-1 httpuv_1.5.2 brew_1.0-6
[43] RcppParallel_4.4.4 lazyeval_0.2.2 munsell_0.5.0 crayon_1.3.4 rjson_0.2.20

Can't update conos

remotes::update_packages("conos")
#> conos (528515681... -> 52d102cb0...) [GitHub]
#> Downloading GitHub repo hms-dbmi/conos@dev
#> Error: HTTP error 422.
#>   No commit found for SHA: rc-0.3.1
#>
#>   Rate limit remaining: 4867/5000
#>   Rate limit reset at: 2019-01-18 23:36:19 UTC
#>
#>

Created on 2019-01-18 by the reprex package (v0.2.1.9000)

devtools::install_github("hms-dbmi/conos", type = "source", INSTALL_opts = "--byte-compile")
#> Downloading GitHub repo hms-dbmi/conos@master
#> Error: HTTP error 422.
#>   No commit found for SHA: rc-0.3.1
#>
#>   Rate limit remaining: 4856/5000
#>   Rate limit reset at: 2019-01-18 23:36:19 UTC
#>
#>

Created on 2019-01-18 by the reprex package (v0.2.1.9000)

But for pagoda2 it's ok:

devtools::install_github("hms-dbmi/pagoda2", type = "source",
    INSTALL_opts = "--byte-compile")
#> Downloading GitHub repo hms-dbmi/pagoda2@master
#> Skipping 3 packages not available: AnnotationDbi, BiocGenerics, GO.db
#>
   checking for file/tmp/RtmpzJwvJt/remotes386816c66610/hms-dbmi-pagoda2-012c5e2/DESCRIPTION...checking for file/tmp/RtmpzJwvJt/remotes386816c66610/hms-dbmi-pagoda2-012c5e2/DESCRIPTION#>preparingpagoda2:
#>    checking DESCRIPTION meta-information ...checking DESCRIPTION meta-information
#> ─  cleaning src
#>checking for LF line-endings in source and make files and shell scripts
#>checking for empty or unneeded directories
#>buildingpagoda2_0.0.0.9002.tar.gz#>


#>

Created on 2019-01-18 by the reprex package (v0.2.1.9000)

require(pagoda2); sessioninfo::session_info()
#> Loading required package: pagoda2
#>
#> Warning: replacing previous import 'igraph::%>%' by 'magrittr::%>%' when
#> loading 'pagoda2'
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 3.5.1 (2018-07-02)
#>  os       Ubuntu 18.04.1 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Etc/UTC
#>  date     2019-01-18
#>
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package       * version    date       lib
#>  AnnotationDbi   1.44.0     2018-10-30 [1]
#>  assertthat      0.2.0      2017-04-11 [1]
#>  base64enc       0.1-3      2015-07-28 [1]
#>  Biobase         2.42.0     2018-10-30 [1]
#>  BiocGenerics    0.28.0     2018-10-30 [1]
#>  bit             1.1-14     2018-05-29 [1]
#>  bit64           0.9-7      2017-05-08 [1]
#>  blob            1.1.1      2018-03-25 [1]
#>  brew            1.0-6      2011-04-13 [1]
#>  cli             1.0.1.9000 2018-12-23 [1]
#>  crayon          1.3.4      2017-09-16 [1]
#>  DBI             1.0.0      2018-05-02 [1]
#>  dendsort        0.3.3      2015-12-14 [1]
#>  digest          0.6.18     2018-10-10 [1]
#>  evaluate        0.12       2018-10-09 [1]
#>  GO.db           3.7.0      2018-12-23 [1]
#>  highr           0.7        2018-06-09 [1]
#>  htmltools       0.3.6      2017-04-28 [1]
#>  igraph          1.1.0      2018-12-23 [1]
#>  IRanges         2.16.0     2018-10-30 [1]
#>  irlba           2.3.3      2019-01-18 [1]
#>  knitr           1.21       2018-12-10 [1]
#>  lattice         0.20-38    2018-11-04 [1]
#>  magrittr        1.5        2014-11-22 [1]
#>  MASS            7.3-51.1   2018-11-01 [1]
#>  Matrix          1.2-15     2018-11-01 [1]
#>  memoise         1.1.0      2017-04-21 [1]
#>  pagoda2       * 0.0.0.9002 2019-01-18 [1]
#>  pcaMethods      1.74.0     2018-10-30 [1]
#>  pkgconfig       2.0.2      2018-08-16 [1]
#>  Rcpp            1.0.0      2018-11-07 [1]
#>  rjson           0.2.20     2018-06-08 [1]
#>  rmarkdown       1.11       2018-12-08 [1]
#>  Rook            1.1-1      2014-10-20 [1]
#>  RSQLite         2.1.1      2018-05-06 [1]
#>  S4Vectors       0.20.1     2018-11-09 [1]
#>  sessioninfo     1.1.1      2018-11-05 [1]
#>  stringi         1.2.4      2018-07-20 [1]
#>  stringr         1.3.1      2018-05-10 [1]
#>  triebeard       0.3.0      2016-08-04 [1]
#>  urltools        1.7.1      2018-08-03 [1]
#>  withr           2.1.2      2018-03-15 [1]
#>  xfun            0.4        2018-10-23 [1]
#>  yaml            2.2.0      2018-07-25 [1]
#>  source
#>  Bioconductor
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  Bioconductor
#>  Bioconductor
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  Github (r-lib/cli@56538e3)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  Bioconductor
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  Github (igraph/rigraph@057cc9d)
#>  Bioconductor
#>  Github (bwlewis/irlba@c2fa5a3)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  Github (hms-dbmi/pagoda2@012c5e2)
#>  Bioconductor
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  Bioconductor
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>  CRAN (R 3.5.1)
#>
#> [1] /usr/local/lib/R/library

Created on 2019-01-18 by the reprex package (v0.2.1.9000)

buildGraph fails with non-informative message if PCA is not provided

Hey,
I am currently having an issue when trying to build the neighbourhood graph.

Error in con$buildGraph(k = 10, k.self = 5, space = "PCA", ncomps = 50) : 
  insufficient number of comparable pairs

I think it might be related to doing the preprocessing in scanpy. Which preprocessing steps should be done before building the graph? Does the PCA need to be present before running the buildGraph function?

plotPanel(use.common.embedding=T) should fix the xlim/ylim values

Right now, the xlim/ylim values are determined from the range of points in each sample and can vary significantly between samples, which makes the panel confusing to read. The plotPanel() code already deals with plot_grid(), and I am not sure whether the limits can be adjusted on that object to avoid going into plotSamples().
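
One way to get shared axis limits is to pool the coordinates of all samples before plotting. The following is a minimal base-R sketch of that idea only. The `embeddings` list and the `common_limits()` helper are hypothetical stand-ins, not part of the conos API.

```r
# Hypothetical per-sample 2D embeddings (rows = cells, columns = x and y).
set.seed(1)
embeddings <- list(
  S1 = matrix(rnorm(20, sd = 1), ncol = 2),
  S2 = matrix(rnorm(20, sd = 5), ncol = 2)
)

# Shared limits: the range of each coordinate pooled over all samples.
common_limits <- function(embs) {
  list(
    xlim = range(unlist(lapply(embs, function(e) e[, 1]))),
    ylim = range(unlist(lapply(embs, function(e) e[, 2])))
  )
}

shared <- common_limits(embeddings)
```

The resulting `shared$xlim` and `shared$ylim` could then be applied to every per-sample plot, for example through ggplot2's `coord_cartesian(xlim = ..., ylim = ...)`, so that all panels use the same scale.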

Error installing Conos: package or namespace load failed for ‘conos’ in dyn.load(file, DLLpath = DLLpath, ...

Hello, I would love to use your package, but I am having some trouble installing Conos on my Macbook. I have followed the instructions for installing dependencies from your main GitHub page, but I am getting this error.

Error: package or namespace load failed for ‘conos’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/Library/Frameworks/R.framework/Versions/3.6/Resources/library/00LOCK-conos/00new/conos/libs/conos.so':
  dlopen(/Library/Frameworks/R.framework/Versions/3.6/Resources/library/00LOCK-conos/00new/conos/libs/conos.so, 6): Library not loaded: igraph.so
  Referenced from: /Library/Frameworks/R.framework/Versions/3.6/Resources/library/00LOCK-conos/00new/conos/libs/conos.so
  Reason: image not found
Error: loading failed
Execution halted
ERROR: loading failed
* removing ‘/Library/Frameworks/R.framework/Versions/3.6/Resources/library/conos’
Error: Failed to install 'conos' from GitHub:
  (converted from warning) installation of package ‘/var/folders/7w/f1vdsndj0_j094d6b1cy2x180000gn/T//Rtmp4NqQ2Y/file1625bbc7157/conos_1.2.1.tar.gz’ had non-zero exit status

I would appreciate any advice in downloading Conos properly. Thanks so much for the help!

Extract duplicated code from Conos and Pagoda2 to separate package

We have a lot of duplicated code in Pagoda and Conos:

  • local LargeVis
  • n2 library
  • ggplot embeddings, when I copy them to Pagoda
  • I'd also extend shiny app to be able to work with clusters in Pagoda
  • new Multilevel clustering with resolution parameter

Would be better to have it in a separate package.

Long runtime in embedGraph(method="UMAP")

This takes 20+ min to run.

Code:

con$embedGraph(method="UMAP")

Object:

lapply(con$samples,function(x) str(x$counts))
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
..@ i : int [1:14427418] 13 58 96 117 119 134 141 148 180 198 ...
..@ p : int [1:18657] 0 317 1322 1363 1686 1717 1749 1804 2204 3250 ...
..@ Dim : int [1:2] 6255 18656
..@ Dimnames:List of 2
.. ..$ : chr [1:6255] "S1_AAACCCAAGATCGGTG" "S1_AAACCCAAGGCAGGGA" "S1_AAACCCACAAATAGCA" "S1_AAACCCACATGGAATA" ...
.. ..$ : chr [1:18656] "AL627309.1" "AL669831.5" "LINC00115" "NOC2L" ...
..@ x : Named num [1:14427418] 0.0471 0.1322 0.0632 0.0369 0.0526 ...
.. ..- attr(*, "names")= chr [1:14427418] "S1_AAACGAAGTGTGGTCC" "S1_AAAGTCCTCGGCCAAC" "S1_AACAGGGAGACATATG" "S1_AACCATGCAGAGAAAG" ...
..@ factors : list()
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
..@ i : int [1:16225403] 5 28 63 75 99 118 130 142 175 220 ...
..@ p : int [1:18205] 0 389 1510 1589 1653 2184 2259 2366 3172 4162 ...
..@ Dim : int [1:2] 6470 18204
..@ Dimnames:List of 2
.. ..$ : chr [1:6470] "S2_AAACCCAAGATGGCAC" "S2_AAACCCAAGTGCGCTC" "S2_AAACCCACAACCAGAG" "ctrl_039_AAACCCACACTGCGTG" ...
.. ..$ : chr [1:18204] "AL627309.1" "AL669831.5" "LINC00115" "SAMD11" ...
..@ x : Named num [1:16225403] 0.2572 0.0987 0.0631 0.1405 0.0882 ...
.. ..- attr(*, "names")= chr [1:16225403] "S2_AAACCCAGTGAAGCTG" "S2_AAAGAACGTTTGATCG" "S2_AAAGTCCTCGCCATAA" "S2_AAATGGAGTATACGGG" ...
..@ factors : list()
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
..@ i : int [1:9283290] 16 18 23 26 50 62 65 75 100 109 ...
..@ p : int [1:17626] 0 271 313 1093 1169 1234 1549 1603 1698 1797 ...
..@ Dim : int [1:2] 3500 17625
..@ Dimnames:List of 2
.. ..$ : chr [1:3500] "S3_AAACCCACACTACACA" "S3_AAACCCAGTACTAAGA" "S3_AAACCCATCGTTTACT" "S3_AAACCCATCTTGGAAC" ...
.. ..$ : chr [1:17625] "AL627309.1" "AC114498.1" "AL669831.5" "LINC00115" ...
..@ x : Named num [1:9283290] 0.1539 0.039 0.0366 0.2743 0.0311 ...
.. ..- attr(*, "names")= chr [1:9283290] "S3_AAAGGATCACGCAAAG" "S3_AAAGGATGTAGCTCGC" "S3_AAAGGGCTCGGTTGTA" "S3_AAAGGTACACGCACCA" ...
..@ factors : list()
$S1
NULL

$S2
NULL

$S3
NULL

sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /cm/shared/apps/intel/parallel_studio_xe/2018_update2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64_lin/libmkl_gf_lp64.so

locale:
[1] C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] conos_0.0.0.9002 igraph_1.2.4 Matrix_1.2-15 RLinuxModules_0.2

loaded via a namespace (and not attached):
[1] mclust_5.4.3 Rcpp_1.0.1 mvtnorm_1.0-10
[4] lattice_0.20-38 GO.db_3.7.0 class_7.3-14
[7] assertthat_0.2.1 digest_0.6.18 mime_0.6
[10] R6_2.4.0 plyr_1.8.4 stats4_3.5.0
[13] RSQLite_2.1.1 ggplot2_3.1.1 pillar_1.3.1
[16] rlang_0.3.4 lazyeval_0.2.2 diptest_0.75-7
[19] irlba_2.3.3 whisker_0.3-2 blob_1.1.1
[22] kernlab_0.9-27 S4Vectors_0.20.1 urltools_1.7.2
[25] triebeard_0.3.0 bit_1.1-14 munsell_0.5.0
[28] shiny_1.3.0 compiler_3.5.0 httpuv_1.5.1
[31] pkgconfig_2.0.2 BiocGenerics_0.28.0 base64enc_0.1-3
[34] pcaMethods_1.74.0 htmltools_0.3.6 nnet_7.3-12
[37] tidyselect_0.2.5 tibble_2.1.1 gridExtra_2.3
[40] pagoda2_0.0.0.9002 IRanges_2.16.0 dendextend_1.10.0
[43] viridisLite_0.3.0 crayon_1.3.4 dplyr_0.8.0.1
[46] later_0.8.0 MASS_7.3-51.1 grid_3.5.0
[49] xtable_1.8-3 gtable_0.3.0 DBI_1.0.0
[52] magrittr_1.5 scales_1.0.0 dendsort_0.3.3
[55] viridis_0.5.1 promises_1.0.1 flexmix_2.3-15
[58] robustbase_0.93-4 brew_1.0-6 rjson_0.2.20
[61] tools_3.5.0 fpc_2.1-11.1 bit64_0.9-7
[64] Biobase_2.42.0 glue_1.3.1 trimcluster_0.1-2.1
[67] DEoptimR_1.0-8 purrr_0.3.2 Rook_1.1-1
[70] parallel_3.5.0 AnnotationDbi_1.44.0 colorspace_1.4-1
[73] cluster_2.0.7-1 prabclus_2.2-7 memoise_1.1.0
[76] modeltools_0.2-22

embedGraph() as UMAP, output is LargeVis

Hi,

I am using Conos to integrate Seurat objects.

I would like to use UMAP instead of LargeVis. When I specify the following parameters:

obj$embedGraph(method="UMAP", min.dist=0.01, spread=15, n.cores=1, min.prob.lower=1e-3)

I receive the following output:

Convert graph to adjacency list...
Done
Estimate nearest neighbors and commute times...
Estimating hitting distances: 15:13:42.
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Done.
Estimating commute distances: 15:18:49.
Hashing adjacency list: 15:18:49.
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Done.
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
Estimating distances: 15:19:37.
**************************************************|
Done
Done.
All done!: 15:22:28.
Done
Estimate UMAP embedding...
15:22:29 UMAP embedding parameters a = 0.02659 b = 0.7912
15:22:29 Read 87349 rows and found 1 numeric columns
15:22:30 Commencing smooth kNN distance calibration using 1 thread
15:22:39 Initializing from normalized Laplacian + noise
15:22:53 Commencing optimization for 1000 epochs, with 3550290 positive edges using 1 thread
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
15:34:41 Optimization finished
Done
Merging 25 samples
Adding pairwise alignments to 'conos.pairs' in miscellaneous data
Adding graph as 'SCT_mnn'
Adding graph embedding as largeVis
Adding clustering information
  |======================================================================| 100%

Even though the embedding is computed with UMAP, it seems to be added to the object as a LargeVis embedding ("Adding graph embedding as largeVis"). Is that expected?

Could anyone offer any advice?

Connor

Is current implementation of getCorrectionVector correct?

Hi @pkharchenko ,
I'm looking at the implementation of getCorrectionVector and wondering whether the behaviour is correct. Do you remember what the idea was? Currently conos subtracts the mean (optionally weighted) from the log fold changes. But why is that supposed to work?
Let's say we have 3 clusters with logFoldChange of gene G1 equal to c(0, 4, 8). After subtracting the mean we get c(-4, 0, 4). In my opinion, this doesn't fit the definition "removing the constant effect between the same clusters". To speak about a "constant effect" we need to subtract the minimum, and only if all changes have the same sign, i.e. for c(2, 4, 8) we need to subtract 2, but for c(-1, 10, 10) we shouldn't change the values.
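
The two behaviours contrasted above can be written out as a small base-R sketch. Both helper names are hypothetical and neither is the conos implementation. Treating the all-negative case symmetrically (subtracting the value closest to zero) is an assumption that goes beyond what the post states.

```r
# Behaviour described in the post: centre the log fold changes on their mean.
subtract_mean <- function(lfc) lfc - mean(lfc)

# Proposed alternative (hypothetical helper): remove only the shared component,
# and only when all changes have the same sign.
subtract_constant_effect <- function(lfc) {
  if (all(lfc > 0)) return(lfc - min(lfc))  # all positive: subtract the minimum
  if (all(lfc < 0)) return(lfc - max(lfc))  # all negative: assumed symmetric case
  lfc                                       # mixed signs or zeros: leave unchanged
}

subtract_mean(c(0, 4, 8))                 # -4 0 4
subtract_constant_effect(c(2, 4, 8))      # 0 2 6
subtract_constant_effect(c(-1, 10, 10))   # -1 10 10
```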

Function plotGraph doesn't support "gene" parameter

I am using Conos in a Docker container. I cannot visualize gene expression on the joint graph embedding: running "con$plotGraph(gene='Eomes', title='Eomes expression')" shows "Warning: Ignoring unknown parameters: gene". There is no problem when I run "con$plotPanel(gene = 'Eomes')". How can I fix it? Thank you for your help!

Reduce in con$plotPanel lost the original cell groups

When working with Seurat objects in conos, the cell group ids in con$plotPanel are reassigned instead of being taken from the ids previously provided by Seurat. Seurat uses numbers stored as a factor to label cell groups.

In con$plotPanel,

Class method definition for method plotPanel()
function (clustering = NULL, groups = NULL, colors = NULL, gene = NULL, 
    use.local.clusters = FALSE, plot.theme = NULL, ...) 
{
    if (use.local.clusters) {
        if (is.null(clustering) && !(inherits(x = samples[[1]], 
            what = c("seurat", "Seurat")))) {
            stop("You have to provide 'clustering' parameter to be able to use local clusters")
        }
        groups <- Reduce(c, lapply(samples, getClustering, clustering))
        if (is.null(groups)) {
            stop(paste0("No clustering '", clustering, "' presented in the samples"))
        }
    }
    else if (is.null(groups) && is.null(colors) && is.null(gene)) {
        groups <- getClusteringGroups(clusters, clustering)
    }
    gg <- plotSamples(samples, groups = groups, colors = colors, 
        gene = gene, plot.theme = adjustTheme(plot.theme), ...)
    return(gg)
}

groups <- Reduce(c, lapply(samples, getClustering, clustering)) converts the factor to numbers following the storage order, so the cluster identities are lost. Is this the intended behaviour or a bug?
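
The effect is easy to reproduce outside conos. In R versions before 4.1.0, c() applied to factors returns their integer codes rather than their labels. A minimal sketch of a label-preserving concatenation follows. The sample data and the `to_named_chr()` helper are hypothetical, and this is not the fix adopted in the package.

```r
# Two per-sample clusterings stored as named factors (hypothetical data).
s1 <- factor(c(cellA = "3", cellB = "7"))
s2 <- factor(c(cellC = "7", cellD = "12"))

# Convert each factor to a named character vector before concatenating,
# so the labels survive regardless of how c() treats factors.
to_named_chr <- function(f) setNames(as.character(f), names(f))

groups <- as.factor(unlist(lapply(list(s1, s2), to_named_chr)))
```

Here `groups` keeps the original labels "3", "7", "7", "12" together with the cell names, instead of the per-sample integer codes.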

Correct labeling in plotClusterBarplots

In plotClusterBarplots, the 'entropy' and 'number of cells' plots do not have the x-axis factorized properly (whereas 'fraction of cells' does). In the file plot.R, in line 336, we call:

x=factor(cluster, levels=1:ncol(xt))

which produces the correct ordering of cluster labels in the main plot. However, in lines 352 and 359:

cluster=as.factor(colnames(xt))

which gives the order: 1, 10, 11, 2, 3 …
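
The difference between the two calls can be reproduced in base R. In this sketch `cluster_names` is a hypothetical stand-in for colnames(xt).

```r
# Stand-in for colnames(xt): cluster labels stored as character strings.
cluster_names <- as.character(1:11)

# as.factor() sorts the labels as strings, so "10" and "11" come before "2".
string_order <- as.factor(cluster_names)

# Explicit integer levels, as in line 336, keep the numeric order.
numeric_order <- factor(cluster_names, levels = seq_along(cluster_names))

levels(string_order)
levels(numeric_order)
```

Using the same explicit `levels` in all three plots would make the x-axis ordering consistent.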

analysis of multiple time points * 2 replicates of scRNA-seq

Dear all,

I'd appreciate your suggestions on the following case of scRNA-seq analysis with CONOS / Seurat 3.1. The question is: which of the analysis strategies described below would you recommend?

Suppose we have 4 batches of scRNA-seq data from these experiments:

WT_batch1, WT_batch2, A_batch1, A_batch2

WT_batch3, WT_batch4, B_batch3, B_batch4

what is the optimal way to analyze the data-sets ? Several analysis strategies are possible :

STRATEGY A.

  1. to use CELLRANGER AGGR (with NORMALIZATION = TRUE) on :

WT_batch1, WT_batch2 : to produce WT_batch_1_2

A_batch1, A_batch2 : to produce A_batch_1_2

WT_batch3, WT_batch4 : to produce WT_batch_3_4

B_batch3, B_batch4 : to produce B_batch_3_4

  2. and to follow the descriptions of SEURAT pipelines with CONOS, on WT_batch_1_2, WT_batch_3_4, A_batch_1_2, B_batch_3_4 :

https://htmlpreview.github.io/?https://github.com/satijalab/seurat.wrappers/blob/master/docs/conos.html

STRATEGY B.

  1. to use SEURAT MERGE function in order to have all the raw data (WT_batch1, WT_batch2, A_batch1, A_batch2, WT_batch3, WT_batch4, B_batch3, B_batch4) in a large MATRIX

  2. could I apply CONOS on all the experiments with those 2 replicates, and afterwards, how could I call the FindMarkers function in SEURAT (FindMarkers(object, ident.1, ident.2)) in order to specify the REPLICATES in ident.1 and in ident.2 ?

Any other analysis strategy ? Any suggestions, comments would be very welcome ! Thanks a lot !

bogdan

conos via docker container

Hi,
I'm trying to install and use conos as a docker container. I followed the directions and got stuck at the localhost:8787 point, where I am asked for an RStudio login ID and password. ID and password for which system? I tried the password for my laptop and for docker; neither worked.
Please see the attached screenshot.

Aligning Count and TPM matrices

Hello,

I've been testing Conos for alignment of 16 large, GEO deposited, single-cell sequencing experiments. This appears to be working quite well, however, I noticed that many of the functions in conos/pagoda2 assume (or declare) data.type == 'counts'. Frequently, GEO deposited scRNA data is deposited as a TPM matrix, not raw counts, and the data we're integrating is a mix of count and TPM data. The data appears to align into well-mixed co-clusters, but I was wondering if this scenario has been tested at all and if there are any special considerations/adjustments I should be making to the function parameters for the TPM samples.

Also, how might this be affecting the per-cluster markers called with getDifferentialGenes? The called cluster markers appear fairly robust, and many recapitulate cell-type markers defined in the relevant literature, but some do not, and quite a few are listed as "<NA>" (which may be a different issue).

Overall though, the results from this application are extremely impressive!

Thanks!

Exclude celltypes in getCorrectionVector earlier

In my particular dataset I have a celltype that consists mostly of one sample type. Therefore, getPerCellTypeDE finds no DE genes for it and outputs an empty vector. That's fine, however, when running getCorrectionVector I would get an error in
https://github.com/hms-dbmi/conos/blob/2c60c4cc9e69ac60b29dbb5f23d46037a33e0642/R/de.functions.R#L123-L130
, since it cannot do the operation for an empty vector.

I thought a solution to that would be to exclude that celltype through the exclude.celltypes parameter, however, this parameter is used after the aforementioned code that caused an error.

The solution to that would be to exclude celltypes at an earlier step, which I implement in the pull request #18
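
The idea of filtering before the failing step can be sketched in base R. The `de_results` list and the `drop_cell_types()` helper are hypothetical illustrations, not the code from the pull request. Only the exclude.celltypes parameter name comes from the post.

```r
# Hypothetical per-cell-type DE results: named log fold changes per cell type.
# "rare_type" came back empty, as in the case described above.
de_results <- list(
  Tcells    = c(G1 = 1.2, G2 = -0.4),
  Bcells    = c(G1 = 0.8, G2 = 0.1),
  rare_type = numeric(0)
)

# Drop explicitly excluded cell types first, then any with no DE results,
# so later per-cell-type operations never see an empty vector.
drop_cell_types <- function(de, exclude = NULL) {
  de <- de[setdiff(names(de), exclude)]
  Filter(function(x) length(x) > 0, de)
}

kept <- drop_cell_types(de_results, exclude = "Bcells")
names(kept)
```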

Unable to install on Mac due to -fopenmp

I'm really interested in trying Conos, but am unable to install Conos on my Macbook. I have followed the instructions for installing dependencies as outlined here: https://github.com/hms-dbmi/pagoda2 , but I continue to get this error message:

clang: error: unsupported option '-fopenmp'
make[1]: *** [base.o] Error 1
make: *** [sublibraries] Error 1
ERROR: compilation failed for package ‘conos’
* removing ‘/Library/Frameworks/R.framework/Versions/3.6/Resources/library/conos’
Error: Failed to install 'conos' from GitHub:
  (converted from warning) installation of package ‘/var/folders/6h/50x3k8nn4mdgm16fqqkq5qlc0000gq/T//RtmpdUQFgW/file3aad12dcb032/conos_1.2.1.tar.gz’ had non-zero exit status

What can I do to install on Mac? Thank you!!

buildGraph fails for large matrices

Hi there!

I just installed Conos and am excited to test it out on some data. It seems like the buildGraph function fails if your matrix exceeds a certain size. I tried aggregating a few smaller datasets along with a large one (~70k cells), and got the following error:

"
found 0 out of 6 cached PCA space pairs ... running 6 additional PCA space pairs done
inter-sample links using mNN done
local pairs Error in papply(samples, function(x) { :
Errors in papply: Error in n2Knn(pca, k.self + 1, 1, FALSE) :
SpMat::init(): requested size is too large
"

The error goes away if I don't include that large dataset and run the function on just the other ones. Not sure if there's a good fix for that, beyond maybe splitting the large object up into smaller chunks?

Any input would be appreciated! Thanks!

Gabriela
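
The chunking workaround suggested in the post could look like the base-R sketch below. The `counts` matrix and the `split_cells()` helper are hypothetical. Whether this avoids the SpMat::init() error is an untested assumption, and splitting cells sequentially may separate related populations arbitrarily.

```r
# Stand-in for a large cells-by-genes matrix (rows are cells).
counts <- matrix(
  1:60, nrow = 10,
  dimnames = list(paste0("cell", 1:10), paste0("gene", 1:6))
)

# Split the rows (cells) into consecutive chunks of at most chunk_size cells.
split_cells <- function(m, chunk_size) {
  idx <- split(seq_len(nrow(m)), ceiling(seq_len(nrow(m)) / chunk_size))
  chunks <- lapply(idx, function(i) m[i, , drop = FALSE])
  names(chunks) <- paste0("chunk", seq_along(chunks))
  chunks
}

chunks <- split_cells(counts, chunk_size = 4)
sapply(chunks, nrow)
```

Each chunk could then be treated as a separate sample in the panel.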

Install error

Hi Sir,
I ran into the error below when installing Conos on a Linux system. Could you give me some suggestions for solving the problem? Thanks!

Error : (converted from warning) /tmp/RtmpzKRDNH/R.INSTALL5a646ef073ba/conos/man/getClusterPrivacy.Rd:6: unexpected section header '\usage'
