ltla / batchelor

Clone of the Bioconductor repository for the batchelor package.

Home Page: https://bioconductor.org/packages/devel/bioc/html/batchelor.html

Languages: R 95.73%, C++ 3.69%, TeX 0.54%, C 0.03%
Topics: batch-correction, bioconductor-package, human-cell-atlas, single-cell-rna-seq

batchelor's People

Contributors

jwokaty, ltla, nturaga

batchelor's Issues

Remaining refactoring items to do:

  • Add a no-correction method with the same interface for easy comparison.
  • Allow specification of PCA weights in fastMNN().
  • Switch mnnCorrect() to use trees, strip out the distinction between input/output.

Using multiple datasets for fastMNN

Hi,

I am trying to use multiple datasets as input to fastMNN. I can run it successfully with the code below.

mnn.out <- fastMNN(sce1,sce2)

As I have more than 10 datasets, it will be a hassle to type all their names. Secondly, the batch variable in the output is a numeric array rather than a character array.

I tried following the OSCA tutorial: first create a list, process the individual elements of the list, and then use that as input.
list.sce <-list(A=sce1,B=sce2)
mnn.out <- fastMNN(list.sce)

But this gave me an error saying that I need to provide batch information. However, in the example in the tutorial
mnn.pancreas <- fastMNN(normed.pancreas)
no batch information is given and the batches still appear as characters.

When I provided the batch information as a character or factor array, both times it gave me an error about a difference in dimensions. I created the array by repeating each sample ID as many times as there are cells in that sample.

Could you suggest what could be the solution?
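
One way to avoid typing every object name is to keep the datasets in a named list and derive everything else from it. The sketch below uses plain matrices as stand-ins for the SingleCellExperiment objects (an assumption, so that it runs without Bioconductor) and builds a batch vector whose length matches the number of cells in the combined matrix, which is the dimension that has to agree. The names `mats`, `combined` and `batch` are illustrative only, and the fastMNN calls at the end are left commented out because they are not run here.

```r
# Stand-ins for per-sample expression matrices (genes x cells).
set.seed(1)
mats <- list(
    A = matrix(rnorm(50), nrow = 10),   # 5 cells
    B = matrix(rnorm(80), nrow = 10)    # 8 cells
)

# One label per cell: each sample name repeated by that sample's cell count.
batch <- rep(names(mats), times = vapply(mats, ncol, integer(1)))

# Combine the samples column-wise; the batch vector must match ncol().
combined <- do.call(cbind, mats)
stopifnot(length(batch) == ncol(combined))

# Hypothetical usage, not run here (requires batchelor):
# mnn.out <- batchelor::fastMNN(combined, batch = batch)
# mnn.out <- do.call(batchelor::fastMNN, list.sce)  # list elements as separate arguments
```

The `do.call()` form is a general R idiom for passing the elements of a list as individual arguments; whether the list names carry through as batch labels depends on the installed version.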

reducedMNN error: C stack usage 7978148 is too close to the limit

Hello, thank you for the amazing package! However, I ran into an issue when running reducedMNN() with my own PCA embeddings. I hit an error saying ERROR: C stack usage 7978148 is too close to the limit.
When we looked into the function, we realized that reducedMNN ultimately calls .create_tree_predefined, which in turn calls .binarize_tree and .fill_tree. Both of those functions are recursive.
Any thoughts on fixing this problem? Thanks!
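
The limit in the message is the C stack available to the R process, which base R reports through `Cstack_info()`. Deep recursion is a common way to exhaust it, and a usual workaround is to replace the recursion with an explicit stack. The sketch below is a generic illustration on a nested-list tree; `count_leaves` is a hypothetical helper and is not the code of `.binarize_tree` or `.fill_tree`.

```r
# Report the C stack size and current usage for this R session.
Cstack_info()

# Hypothetical helper: count the leaves of a nested-list tree without
# recursion, by keeping the nodes still to be visited on an explicit stack.
count_leaves <- function(tree) {
    stack <- list(tree)
    n <- 0L
    while (length(stack) > 0L) {
        node <- stack[[length(stack)]]      # take the last node
        stack[[length(stack)]] <- NULL      # and remove it from the stack
        if (is.list(node)) {
            stack <- c(stack, node)         # push the children
        } else {
            n <- n + 1L                     # a leaf
        }
    }
    n
}

tree <- list("D1", list("D2", "D3"))
count_leaves(tree)
```

Because the pending nodes live in an ordinary R list, the depth of the tree no longer translates into nested function calls.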

could not find function "quickCorrect"

I finished installing the "batchelor" package without any errors. But when I use the "quickCorrect" function, it shows
could not find function "quickCorrect"
It's very weird, as this is a very important function of this toolbox. Where is it?
Thanks,
Cain
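
This message typically means either that the package was not attached in the current session or that the installed version does not export the function. A minimal way to check, assuming nothing beyond base R, is sketched below; `has_export` is a hypothetical helper, and the batchelor-specific calls are commented out because they depend on the local installation.

```r
# Hypothetical helper: TRUE if `pkg` is installed and exports `fn`.
has_export <- function(pkg, fn) {
    requireNamespace(pkg, quietly = TRUE) && fn %in% getNamespaceExports(pkg)
}

has_export("stats", "median")              # a base package, for illustration

# The checks relevant to this issue (not run here):
# has_export("batchelor", "quickCorrect")
# packageVersion("batchelor")
# library(batchelor)                       # attach before calling quickCorrect()
```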

Using MNN corrected data for per cell pathway scoring

Hi MNN group,

I see in the MNN help pages that it is not recommended to use MNN-corrected gene expression matrix for quantitative analysis such as differential gene expression. But I wonder if it is reasonable to use it to calculate pathway scores in each cell (i.e. in each cell, transform the gene expression into pathway scores based on certain pre-defined gene-pathway mappings). This process does not involve any between-cell comparisons. Does this sound reasonable to you?

Thank you,
Jack

Unexpected number of cells produced in fastMNN

I tried to run fastMNN (installed today from Bioconductor), but I get more cells than expected in the corrected object. Any idea why?

sobj <- subset(data, features=hvg)
print('sobj')
print(sobj)
expr <- GetAssayData(object = sobj ,slot = "data")
print('expr')
print(dim(expr))
sce <- fastMNN(expr, batch = [email protected][[batch]])
print('sce')
print(dim(sce))

[1] "sobj"
An object of class Seurat 
2000 features across 2730 samples within 1 assay 
Active assay: originalexp (2000 features, 0 variable features)
[1] "expr"
[1] 2000 2730
[1] "sce"
[1] 2000 9815

How to convert corrected data to positive values?

Dear developer,

Thank you a lot for the great tool! I have two datasets with very different cell compositions (one dataset contains only T cells, and the other dataset has diverse cell types), and they benefited a lot from fastMNN integration.

Now I'd like to use the corrected data to understand how the cells from the 1st dataset communicate with malignant cells in the 2nd using CellPhoneDB/CellChat. However, both tools only accept normalized counts as input, and negative values are not allowed. Do you have advice on how to convert the corrected data to positive values?

Thank you very much!

Correcting for multiple batches

Hi,

I was hoping you could give some advice for how to correct for multiple batches. I have batches from experimental procedures (samples processed in two batches) and due to a sequencing error, the sequencing was also done in two batches. The breakdown is as follows (Y axis - experimental batches, X axis - sequencing batches)

                       Sequencing batch 1   Sequencing batch 2
Experimental batch 1           19                   23
Experimental batch 2           15                    5

Currently, I've read in the two sequencing runs separately, corrected them (the correction worked beautifully!), extracted the data, and created two new separate objects based on the experimental batches. The second batch did not correct very much. I'm used to doing my scSEQ analysis in Seurat, so after batch correcting with batchelor I extracted the corrected counts to make a Seurat object. After the first correction, Seurat still shows an influence from the experimental batch, but when I look at the pre-correction PCA separated by experimental batch with the scater/batchelor workflow, it doesn't detect a difference.

I can provide some code and figures if it would help but don't want to bore you if the approach is completely wrong.

Your advice is greatly appreciated!

Thanks,
-Frances

SummarizedExperiment::assay: on fastMNN object : object of type 'S4' is not subsettable

Hello,

I have an issue with the 'fastMNN' function in batchelor.
I can't access the assay of the generated object as I get an error.

out= batchelor::fastMNN(objects.sce)
out

class: SingleCellExperiment
dim: 2000 10000
metadata(2): merge.info pca.info
assays(1): reconstructed
rownames(2000): PTGDS S100B ... BCAS1 AGPAT5
rowData names(1): rotation
colnames(10000): AGAGCGAGTTAGATGA-1 ATTGGTGCAATCTACG-1 ... TTAGGCATCATGGTCA-1
TGGCCAGCAATGGACG-1
colData names(1): batch
reducedDimNames(1): corrected
mainExpName: NULL
altExpNames(0):

assay(out)

Error in (function (cond) : error in evaluating the argument 'x' in selecting a method for function 'type': object of type 'S4' is not subsettable

When I manually create an sce object, I don't encounter this problem.
Similarly, when I convert a Seurat object into an sce object, there are no issues accessing the assays.

Here's my sessionInfo()

R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=French_France.utf8 LC_CTYPE=French_France.utf8
[3] LC_MONETARY=French_France.utf8 LC_NUMERIC=C
[5] LC_TIME=French_France.utf8

time zone: Europe/Paris
tzcode source: internal

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] SeuratWrappers_0.3.1 genefilter_1.82.1 magick_2.8.0
[4] rgl_1.2.1 destiny_3.14.0 sinaplot_1.1.0
[7] plyr_1.8.8 RColorBrewer_1.1-3 ggplot2_3.4.3
[10] DESeq2_1.40.2 patchwork_1.1.3 UCell_2.4.0
[13] harmony_1.0.1 Rcpp_1.0.11 sscVis_0.1.0
[16] SingleCellExperiment_1.22.0 SummarizedExperiment_1.30.2 Biobase_2.60.0
[19] GenomicRanges_1.52.0 GenomeInfoDb_1.36.3 IRanges_2.34.1
[22] S4Vectors_0.38.2 BiocGenerics_0.46.0 MatrixGenerics_1.12.3
[25] matrixStats_1.0.0 dplyr_1.1.2 SeuratObject_4.1.3
[28] Seurat_4.3.0.1 markdown_1.8 knitr_1.44

loaded via a namespace (and not attached):
[1] spatstat.sparse_3.0-2 bitops_1.0-7 httr_1.4.7
[4] doParallel_1.0.17 dynamicTreeCut_1.63-1 tools_4.3.1
[7] sctransform_0.4.0 backports_1.4.1 ResidualMatrix_1.10.0
[10] utf8_1.2.3 R6_2.5.1 lazyeval_0.2.2
[13] uwot_0.1.16 GetoptLong_1.0.5 withr_2.5.0
[16] sp_2.0-0 gridExtra_2.3 progressr_0.14.0
[19] cli_3.6.1 spatstat.explore_3.2-3 sass_0.4.7
[22] mvtnorm_1.2-3 robustbase_0.99-0 spatstat.data_3.0-1
[25] proxy_0.4-27 ggridges_0.5.4 pbapply_1.7-2
[28] R.utils_2.12.2 parallelly_1.36.0 maps_3.4.1
[31] limma_3.56.2 TTR_0.24.3 RSQLite_2.3.1
[34] rstudioapi_0.15.0 impute_1.74.1 generics_0.1.3
[37] shape_1.4.6 ica_1.0-3 spatstat.random_3.1-6
[40] car_3.1-2 dendextend_1.17.1 Matrix_1.6-1.1
[43] ggbeeswarm_0.7.2 fansi_1.0.4 abind_1.4-5
[46] R.methodsS3_1.8.2 lifecycle_1.0.3 scatterplot3d_0.3-44
[49] yaml_2.3.7 carData_3.0-5 Rtsne_0.16
[52] blob_1.2.4 grid_4.3.1 promises_1.2.1
[55] crayon_1.5.2 miniUI_0.1.1.1 lattice_0.21-8
[58] beachmat_2.16.0 cowplot_1.1.1 annotate_1.78.0
[61] KEGGREST_1.40.0 pillar_1.9.0 ComplexHeatmap_2.16.0
[64] rjson_0.2.21 boot_1.3-28.1 future.apply_1.11.0
[67] codetools_0.2-19 leiden_0.4.3 glue_1.6.2
[70] remotes_2.4.2.1 pcaMethods_1.92.0 data.table_1.14.8
[73] vcd_1.4-11 vctrs_0.6.3 png_0.1-8
[76] spam_2.9-1 gtable_0.3.4 cachem_1.0.8
[79] ks_1.14.1 xfun_0.40 S4Arrays_1.0.6
[82] mime_0.12 RcppEigen_0.3.3.9.3 pracma_2.4.2
[85] survival_3.5-7 iterators_1.0.14 fields_15.2
[88] ellipsis_0.3.2 fitdistrplus_1.1-11 ROCR_1.0-11
[91] nlme_3.1-163 xts_0.13.1 bit64_4.0.5
[94] RcppAnnoy_0.0.21 bslib_0.5.1 irlba_2.3.5.1
[97] vipor_0.4.5 KernSmooth_2.23-22 DBI_1.1.3
[100] colorspace_2.1-0 nnet_7.3-19 smoother_1.1
[103] ggrastr_1.0.2 tidyselect_1.2.0 bit_4.0.5
[106] extrafontdb_1.0 curl_5.0.2 compiler_4.3.1
[109] BiocNeighbors_1.18.0 DelayedArray_0.26.7 plotly_4.10.2
[112] scales_1.2.1 hexbin_1.28.3 DEoptimR_1.1-2
[115] lmtest_0.9-40 stringr_1.5.0 digest_0.6.33
[118] goftest_1.2-3 spatstat.utils_3.0-3 rmarkdown_2.25
[121] XVector_0.40.0 RhpcBLASctl_0.23-42 base64enc_0.1-3
[124] htmltools_0.5.6 pkgconfig_2.0.3 extrafont_0.19
[127] sparseMatrixStats_1.12.2 fastmap_1.1.1 ggthemes_4.2.4
[130] rlang_1.1.1 GlobalOptions_0.1.2 htmlwidgets_1.6.2
[133] shiny_1.7.5 DelayedMatrixStats_1.22.6 jquerylib_0.1.4
[136] zoo_1.8-12 jsonlite_1.8.7 BiocParallel_1.34.2
[139] mclust_6.0.0 R.oo_1.25.0 BiocSingular_1.16.0
[142] RCurl_1.98-1.12 magrittr_2.0.3 scuttle_1.10.2
[145] GenomeInfoDbData_1.2.10 dotCall64_1.0-2 munsell_0.5.0
[148] viridis_0.6.4 reticulate_1.32.0 stringi_1.7.12
[151] zlibbioc_1.46.0 MASS_7.3-60 parallel_4.3.1
[154] listenv_0.9.0 ggrepel_0.9.3 deldir_1.0-9
[157] Biostrings_2.68.1 splines_4.3.1 tensor_1.5
[160] circlize_0.4.15 locfit_1.5-9.8 ranger_0.15.1
[163] igraph_1.5.1 ggpubr_0.6.0 spatstat.geom_3.2-5
[166] ggsignif_0.6.4 RcppHNSW_0.5.0 ScaledMatrix_1.8.1
[169] reshape2_1.4.4 XML_3.99-0.14 evaluate_0.21
[172] BiocManager_1.30.22 laeken_0.5.2 batchelor_1.16.0
[175] foreach_1.5.2 httpuv_1.6.11 Rttf2pt1_1.3.12
[178] VIM_6.2.2 RANN_2.6.1 tidyr_1.3.0
[181] purrr_1.0.2 polyclip_1.10-4 future_1.33.0
[184] clue_0.3-65 scattermore_1.2 gridBase_0.4-7
[187] rsvd_1.0.5 broom_1.0.5 xtable_1.8-4
[190] e1071_1.7-13 RSpectra_0.16-1 rstatix_0.7.2
[193] later_1.3.1 viridisLite_0.4.2 class_7.3-22
[196] tibble_3.2.1 moduleColor_1.8-4 memoise_2.0.1
[199] AnnotationDbi_1.62.2 beeswarm_0.4.0 cluster_2.1.4
[202] ggplot.multistats_1.0.0 globals_0.16.2

Thank you for your help

multiBatchPCA: Non-alphabetically sorted batch levels give error when used with list-style weights

suppressPackageStartupMessages(library(batchelor))

d1 <- matrix(rnorm(5000), ncol=100)
d1[1:10,1:10] <- d1[1:10,1:10] + 2 # unique population in d1
d2 <- matrix(rnorm(2000), ncol=40)
d2[11:20,1:10] <- d2[11:20,1:10] + 2 # unique population in d2
d3 <- d2 + 5

mat <- cbind(d1, d2, d3)
b <- c(rep("D1", ncol(d1)), rep("D2", ncol(d2)), rep("D3", ncol(d3)))
w <- list("D1", list("D2", "D3"))

# multiBatchPCA()
# Ok
multiBatchPCA(mat, batch = b, weights = w, d = 10)
#> List of length 3
#> names(3): D1 D2 D3
# Ok
multiBatchPCA(mat, batch = factor(b), weights = w, d = 10)
#> List of length 3
#> names(3): D1 D2 D3
# Errors
multiBatchPCA(mat, batch = factor(b, levels = c("D2", "D1", "D3")), weights = w, d = 10)
#> Error in .construct_weight_vector(tab, weights): names in tree-like 'weights' do not match names in '...'
# Ok
multiBatchPCA(mat, batch = as.character(factor(b, levels = c("D2", "D1", "D3"))), weights = w, d = 10)
#> List of length 3
#> names(3): D1 D2 D3

# Source of error
# Ok
batchelor:::.construct_weight_vector(table(b), w)
#>   D1   D2   D3 
#> 0.50 0.25 0.25
# Ok
batchelor:::.construct_weight_vector(table(factor(b)), w)
#>   D1   D2   D3 
#> 0.50 0.25 0.25
# Errors
batchelor:::.construct_weight_vector(table(factor(b, levels = c("D2", "D1", "D3"))), w)
#> Error in batchelor:::.construct_weight_vector(table(factor(b, levels = c("D2", : names in tree-like 'weights' do not match names in '...'
# Ok
batchelor:::.construct_weight_vector(table(as.character(factor(b, levels = c("D2", "D1", "D3")))), w)
#>   D1   D2   D3 
#> 0.50 0.25 0.25

Created on 2023-07-11 with reprex v2.0.2

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.1 (2023-06-16)
#>  os       Ubuntu 22.04.2 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language en_AU:en
#>  collate  en_AU.UTF-8
#>  ctype    en_AU.UTF-8
#>  tz       Australia/Melbourne
#>  date     2023-07-11
#>  pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package              * version   date (UTC) lib source
#>  batchelor            * 1.16.0    2023-04-25 [1] Bioconductor
#>  beachmat               2.16.0    2023-04-25 [3] Bioconductor
#>  Biobase              * 2.60.0    2023-04-25 [3] Bioconductor
#>  BiocGenerics         * 0.46.0    2023-04-25 [3] Bioconductor
#>  BiocNeighbors          1.18.0    2023-04-25 [3] Bioconductor
#>  BiocParallel           1.34.2    2023-05-22 [1] Bioconductor
#>  BiocSingular           1.16.0    2023-04-25 [3] Bioconductor
#>  bitops                 1.0-7     2021-04-24 [3] RSPM (R 4.2.0)
#>  cli                    3.6.1     2023-03-23 [3] RSPM (R 4.2.0)
#>  codetools              0.2-19    2023-02-01 [3] RSPM (R 4.2.0)
#>  crayon                 1.5.2     2022-09-29 [3] RSPM (R 4.2.0)
#>  DelayedArray           0.26.3    2023-05-22 [1] Bioconductor
#>  DelayedMatrixStats     1.22.1    2023-06-09 [1] Bioconductor
#>  digest                 0.6.32    2023-06-26 [3] RSPM (R 4.2.0)
#>  evaluate               0.21      2023-05-05 [3] RSPM (R 4.2.0)
#>  fastmap                1.1.1     2023-02-24 [3] RSPM (R 4.2.0)
#>  fs                     1.6.2     2023-04-25 [3] RSPM (R 4.2.0)
#>  GenomeInfoDb         * 1.36.1    2023-06-21 [3] Bioconductor
#>  GenomeInfoDbData       1.2.10    <NA>       [3] Bioconductor
#>  GenomicRanges        * 1.52.0    2023-04-25 [3] Bioconductor
#>  glue                   1.6.2     2022-02-24 [3] RSPM (R 4.2.0)
#>  htmltools              0.5.5     2023-03-23 [3] RSPM (R 4.2.0)
#>  igraph                 1.5.0     2023-06-16 [1] CRAN (R 4.3.0)
#>  IRanges              * 2.34.1    2023-06-22 [3] Bioconductor
#>  irlba                  2.3.5.1   2022-10-03 [3] RSPM (R 4.2.0)
#>  knitr                  1.43      2023-05-25 [3] RSPM (R 4.2.0)
#>  lattice                0.21-8    2023-04-05 [3] RSPM (R 4.2.0)
#>  lifecycle              1.0.3     2022-10-07 [3] RSPM (R 4.2.0)
#>  magrittr               2.0.3     2022-03-30 [3] RSPM (R 4.2.0)
#>  Matrix                 1.5-4.1   2023-05-18 [3] RSPM (R 4.2.0)
#>  MatrixGenerics       * 1.12.2    2023-06-09 [1] Bioconductor
#>  matrixStats          * 1.0.0     2023-06-02 [3] RSPM (R 4.2.0)
#>  pkgconfig              2.0.3     2019-09-22 [3] CRAN (R 4.0.1)
#>  purrr                  1.0.1     2023-01-10 [3] RSPM (R 4.2.0)
#>  R.cache                0.16.0    2022-07-21 [3] RSPM (R 4.2.0)
#>  R.methodsS3            1.8.2     2022-06-13 [3] RSPM (R 4.2.0)
#>  R.oo                   1.25.0    2022-06-12 [3] RSPM (R 4.2.0)
#>  R.utils                2.12.2    2022-11-11 [3] RSPM (R 4.2.0)
#>  Rcpp                   1.0.10    2023-01-22 [3] RSPM (R 4.2.0)
#>  RCurl                  1.98-1.12 2023-03-27 [3] RSPM (R 4.2.0)
#>  reprex                 2.0.2     2022-08-17 [3] RSPM (R 4.2.0)
#>  ResidualMatrix         1.4.0     2021-10-26 [3] Bioconductor
#>  rlang                  1.1.1     2023-04-28 [3] RSPM (R 4.2.0)
#>  rmarkdown              2.23      2023-07-01 [3] RSPM (R 4.2.0)
#>  rstudioapi             0.14      2022-08-22 [3] RSPM (R 4.2.0)
#>  rsvd                   1.0.5     2021-04-16 [3] RSPM (R 4.2.0)
#>  S4Arrays               1.0.4     2023-05-14 [1] Bioconductor
#>  S4Vectors            * 0.38.1    2023-05-02 [3] Bioconductor
#>  ScaledMatrix           1.8.1     2023-05-03 [1] Bioconductor
#>  scuttle                1.10.1    2023-05-02 [1] Bioconductor
#>  sessioninfo            1.2.2     2021-12-06 [3] RSPM (R 4.2.0)
#>  SingleCellExperiment * 1.22.0    2023-04-25 [3] Bioconductor
#>  sparseMatrixStats      1.12.0    2023-04-25 [3] Bioconductor
#>  styler                 1.10.1    2023-06-05 [1] CRAN (R 4.3.0)
#>  SummarizedExperiment * 1.30.2    2023-06-06 [3] Bioconductor
#>  vctrs                  0.6.3     2023-06-14 [3] RSPM (R 4.2.0)
#>  withr                  2.5.0     2022-03-03 [3] RSPM (R 4.2.0)
#>  xfun                   0.39      2023-04-20 [3] RSPM (R 4.2.0)
#>  XVector                0.40.0    2023-04-25 [3] Bioconductor
#>  yaml                   2.3.7     2023-01-23 [3] RSPM (R 4.2.0)
#>  zlibbioc               1.46.0    2023-04-25 [3] Bioconductor
#> 
#>  [1] /home/peter/R/x86_64-pc-linux-gnu-library/4.3
#>  [2] /usr/local/lib/R/site-library
#>  [3] /usr/lib/R/site-library
#>  [4] /usr/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

I'm unsure if the fix is as simple as making the comparison be against sort(names(ncells)) in the following

if (!identical(unname(sort(ids)), names(ncells))) {

because I get a bit lost tracking this one down through the internal function calls.
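
The mismatch can be reproduced in base R without the package. The sketch below shows that `table()` on a factor keeps the level order in its names, so a comparison against alphabetically sorted IDs fails unless the names are sorted too. The variables `ids`, `strict` and `relaxed` are illustrative, and the sketch only covers the comparison itself; whether later code relies on the order of `ncells` is not checked here.

```r
b <- c(rep("D1", 3), rep("D2", 2), rep("D3", 2))
ids <- c("D1", "D2", "D3")   # names taken from the tree-like weights

# table() on a factor reports counts in level order, not alphabetical order.
ncells <- table(factor(b, levels = c("D2", "D1", "D3")))
names(ncells)

# Comparison as quoted above: fails for the reordered levels.
strict <- identical(unname(sort(ids)), names(ncells))

# Comparison with both sides sorted: passes.
relaxed <- identical(unname(sort(ids)), sort(names(ncells)))

c(strict = strict, relaxed = relaxed)
```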

Would this also be applicable to bulk data?

Hi,

great package. I was wondering if this form of batch integration is also applicable to bulk RNA-Seq data. Sure the data is less sparse, but would that be an issue?

Happy about any feedback!
Best,
M

Error Installing batchelor

Hello there batchelor Team!

I'm trying to install batchelor. However, when I run devtools::install_github("LTLA/batchelor"), I get the following error output:

Downloading GitHub repo LTLA/batchelor@master
Skipping 2 packages not available: BiocNeighbors, BiocSingular
Skipping 11 packages ahead of CRAN: beachmat, BiocGenerics, BiocParallel, DelayedArray, gtable, HDF5Array, IRanges, rhdf5, Rhdf5lib, rlang, S4Vectors
Installing 17 packages: Biobase, BiocSingular, DelayedMatrixStats, edgeR, GenomeInfoDb, GenomeInfoDbData, GenomicRanges, limma, locfit, rjson, scater, shinydashboard, SingleCellExperiment, SummarizedExperiment, tximport, XVector, zlibbioc
Installing packages into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
Error: (converted from warning) package ‘BiocSingular’ is not available (for R version 3.5.3)

I then tried to install BiocSingular. However, I get the following error:

In file included from /usr/local/lib/R/site-library/beachmat/include/beachmat/all_readers.h:4:0,
                 from /usr/local/lib/R/site-library/beachmat/include/beachmat/LIN_matrix.h:4,
                 from /usr/local/lib/R/site-library/beachmat/include/beachmat/numeric_matrix.h:4,
                 from compute_scale.cpp:2:
/usr/local/lib/R/site-library/beachmat/include/beachmat/beachmat.h:15:19: fatal error: H5Cpp.h: No such file or directory
 #include "H5Cpp.h"
                   ^
compilation terminated.
/usr/local/lib/R/etc/Makeconf:172: recipe for target 'compute_scale.o' failed
make: *** [compute_scale.o] Error 1
ERROR: compilation failed for package ‘BiocSingular’
* removing ‘/usr/local/lib/R/site-library/BiocSingular’
Error in i.p(...) : 
  (converted from warning) installation of package ‘/tmp/RtmpGJQkfa/file1977a4d39ec/BiocSingular_0.99.14.tar.gz’ had non-zero exit status

Does anyone know a way to solve this? Any help will be really appreciated.

Thanks in advance!

Davi
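
The package is distributed through Bioconductor (see the home page link at the top), so a release matched to the running R version is normally installed with BiocManager rather than from the GitHub master branch, which may depend on packages that are not available for an older R. A minimal sketch of that route follows; `install_batchelor` is a hypothetical wrapper, and it is only defined here, not called, because installation needs network access.

```r
# Hypothetical wrapper around the standard Bioconductor installation route.
install_batchelor <- function() {
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
        install.packages("BiocManager")
    }
    # Installs the batchelor release that matches the current R/Bioconductor version.
    BiocManager::install("batchelor")
}

# install_batchelor()   # not run here
```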

Find marker genes using fastMNN output matrix

Hi batchelor team,

I have applied your integration method fastMNN to my six batches of 10x data. MNN is the best method for my dataset, compared with CCA and scMerge. It worked quite well in terms of downstream clustering! But I have some difficulty finding marker genes for the clusters, since most methods prefer to use the counts or log-normalized counts matrix. Do you have any recommended method for finding marker genes based on your output matrix, which is scaled and centred to zero?

Regards,
Nelosn

Reintegration of subset causing very large file size

Hello,

Thank you for this great tool.

I'm working to analyze multiple samples from different conditions and compare to published data, in total there are 240k cells over ~40 samples. Four broad cell types (epithelial, stromal, vascular, and immune) are present in many but not all samples (e.g., samples are not balanced by cell type). I have normalized in Seurat with SCTv2 and then integrated with FastMNN via the SeuratWrappers function, and FastMNN does a great job. Other methods have led to significant misclassification of cell types.

My issue and request for help is related to subsetting. I generated cell-type-specific variable features using subsets from a few samples with 3-5k cells of that type, and then used them when rerunning FastMNN on a subsetted object as follows. Note that the DietSeurat function strips the Seurat object of all other assays/reductions, such as the original MNN. The issue is that my output object is now enormous: 140 GB for 55k cells. I tried another cell type with only 5k cells and the object was 30 GB. The starting integrated object is about 14 GB, with all 240k cells. My guess is that this may be related to some samples having fewer than 50 cells after subsetting, but I am not sure. And note that a few samples only had 100-300 cells in the initial object creation, so perhaps low cells/sample isn't the issue after all.

mnn_integrated_object <- readRDS(file = "xxx")
mnn_integrated_object <- DietSeurat(mnn_integrated_object, counts = TRUE, data = TRUE, assays = c("RNA", "SCT"))
gc()
subset_mnn <- subset(mnn_integrated_object, cells = Cell_Type_1_List)
DefaultAssay(subset_mnn) <- "SCT"
subset_mnn <- RunFastMNN(object.list = SplitObject(subset_mnn, split.by = "orig.ident"), features = VariableFeatures, assay = "SCT", verbose = TRUE)
subset_mnn <- RunUMAP(subset_mnn, reduction = "mnn", dims = 1:50, min.dist = 0.3)
subset_mnn <- FindNeighbors(subset_mnn, reduction = "mnn", dims = 1:50)
subset_mnn <- FindClusters(subset_mnn, resolution = 1.0)

Any idea what is going on? Of course I could just skip rerunning FastMNN on the subset but I suspect there is interesting biology present and would prefer to redo the PCA/Integration steps.

Many thanks!

confusion about the smooth_gaussian_kernel.cpp with mnnCorrect()

I want to implement the mnnCorrect() function in R (without C++), so I read the source code of smooth_gaussian_kernel.cpp. In the smooth_gaussian_kernel function:
const size_t ngenes=averaged.nrow();
const size_t nmnn=averaged.ncol();
if (nmnn!=index.size()) { throw std::runtime_error("'index' must have length equal to number of rows in 'averaged'"); }

The variable nmnn is the number of columns of "averaged", so why should it be equal to the number of rows in "averaged" in the if condition?

When I debugged the .cpp source file with lldb, I found that I had to transpose "averaged" before calling the smooth_gaussian_kernel function in smooth_gaussian_kernel.cpp; otherwise it throws the error 'index' must have length equal to number of rows in 'averaged'. But this is the R code where mnnCorrect calls this C++ function:
function (data1, data2, mnn1, mnn2, tdata2, sigma) {
    vect <- data1[mnn1, , drop = FALSE] - data2[mnn2, , drop = FALSE]
    cell.vect <- .Call(cxx_smooth_gaussian_kernel, vect, mnn2 - 1L, tdata2, sigma)
    t(cell.vect)
}
It doesn't transpose vect (in my simulated data, the number of columns of vect is the number of genes), but it gives the correct result and doesn't throw an error. I am confused and don't know why this happens. In short, I have two questions:

  1. I think the if condition should be
    if n_genes==index.size() instead of if (nmnn!=index.size()),
  2. Should I transpose the first argument of the smooth_gaussian_kernel function (vect, i.e. "averaged") or not?

Originally posted by @yuxiaokang-source in #18 (comment)
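
For a pure-R attempt, fixing the orientation explicitly and asserting it up front avoids this kind of ambiguity. The sketch below is a plain Gaussian-kernel weighted average of correction vectors, written under stated assumptions: MNN pairs are in rows and genes in columns, the kernel is exp(-d^2 / sigma), and any additional steps the C++ routine performs are omitted. `smooth_vectors` is a hypothetical name, and this is not a statement about what the C++ function expects.

```r
# Hypothetical, simplified smoother (not the package's C++ routine).
# vect:  one row per MNN pair, one column per gene (correction vectors)
# mnn2:  1-based row index in data2 of the batch-2 cell in each pair
# data2: batch-2 expression, cells x genes
smooth_vectors <- function(vect, mnn2, data2, sigma) {
    stopifnot(nrow(vect) == length(mnn2), ncol(vect) == ncol(data2))
    centres <- data2[mnn2, , drop = FALSE]
    out <- matrix(0, nrow(data2), ncol(data2))
    for (i in seq_len(nrow(data2))) {
        # Squared distance from cell i to each MNN cell in batch 2.
        d2 <- rowSums(sweep(centres, 2, data2[i, ], "-")^2)
        w <- exp(-d2 / sigma)               # assumed kernel form
        # Weighted average of the correction vectors, one value per gene.
        out[i, ] <- colSums(vect * w) / sum(w)
    }
    out
}

data2 <- matrix(c(0, 0, 1, 0, 0, 1, 1, 1), ncol = 2, byrow = TRUE)
vect <- matrix(c(1, 2, 1, 2), ncol = 2, byrow = TRUE)
res <- smooth_vectors(vect, mnn2 = c(1L, 4L), data2 = data2, sigma = 1)
dim(res)
```

The `stopifnot()` at the top plays the role of the length check in the C++ code: the index vector has one entry per MNN pair, whichever dimension of the matrix holds the pairs.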

batchelor installation problem with scuttle

I tried installing batchelor with this command, but got an error:


> devtools::install_github("LTLA/batchelor")
Using github PAT from envvar GITHUB_PAT
Downloading GitHub repo LTLA/batchelor@master
Skipping 1 packages not available: scuttle
✓  checking for file ‘/tmp/RtmpceMH6Z/remotes2424a91d4bf/LTLA-batchelor-115a0f4/DESCRIPTION’ ...
─  preparing ‘batchelor’:
✓  checking DESCRIPTION meta-information ...
─  cleaning src
─  checking for LF line-endings in source and make files and shell scripts
─  checking for empty or unneeded directories
─  building ‘batchelor_1.5.1.tar.gz’
   
Installing package into ‘/home/ubuntu/R/x86_64-pc-linux-gnu-library/3.6’
(as ‘lib’ is unspecified)
ERROR: dependency ‘scuttle’ is not available for package ‘batchelor’
* removing ‘/home/ubuntu/R/x86_64-pc-linux-gnu-library/3.6/batchelor’
Error: Failed to install 'batchelor' from GitHub:
  (converted from warning) installation of package ‘/tmp/RtmpceMH6Z/file2424202709f/batchelor_1.5.1.tar.gz’ had non-zero exit status

To deal with the error, I tried to install scuttle as follows, but this also failed:


> devtools::install_github("LTLA/scuttle")
Using github PAT from envvar GITHUB_PAT
Downloading GitHub repo LTLA/scuttle@master
✓  checking for file ‘/tmp/RtmpceMH6Z/remotes2424384a9729/LTLA-scuttle-7120e64/DESCRIPTION’ ...
─  preparing ‘scuttle’:
✓  checking DESCRIPTION meta-information ...
─  cleaning src
─  checking for LF line-endings in source and make files and shell scripts
─  checking for empty or unneeded directories
─  building ‘scuttle_0.99.9.tar.gz’
   
Installing package into ‘/home/ubuntu/R/x86_64-pc-linux-gnu-library/3.6’
(as ‘lib’ is unspecified)
* installing *source* package ‘scuttle’ ...
** using staged installation
** libs
g++ -std=gnu++11 -I"/usr/share/R/include" -DNDEBUG -I../inst/include/ -I"/home/ubuntu/R/x86_64-pc-linux-gnu-library/3.6/Rcpp/include" -I"/home/ubuntu/R/x86_64-pc-linux-gnu-library/3.6/beachmat/include"   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c RcppExports.cpp -o RcppExports.o
g++ -std=gnu++11 -I"/usr/share/R/include" -DNDEBUG -I../inst/include/ -I"/home/ubuntu/R/x86_64-pc-linux-gnu-library/3.6/Rcpp/include" -I"/home/ubuntu/R/x86_64-pc-linux-gnu-library/3.6/beachmat/include"   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c combined_qc.cpp -o combined_qc.o
g++ -std=gnu++11 -I"/usr/share/R/include" -DNDEBUG -I../inst/include/ -I"/home/ubuntu/R/x86_64-pc-linux-gnu-library/3.6/Rcpp/include" -I"/home/ubuntu/R/x86_64-pc-linux-gnu-library/3.6/beachmat/include"   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c downsample_counts.cpp -o downsample_counts.o
g++ -std=gnu++11 -I"/usr/share/R/include" -DNDEBUG -I../inst/include/ -I"/home/ubuntu/R/x86_64-pc-linux-gnu-library/3.6/Rcpp/include" -I"/home/ubuntu/R/x86_64-pc-linux-gnu-library/3.6/beachmat/include"   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c fit_linear_model.cpp -o fit_linear_model.o
g++ -std=gnu++11 -I"/usr/share/R/include" -DNDEBUG -I../inst/include/ -I"/home/ubuntu/R/x86_64-pc-linux-gnu-library/3.6/Rcpp/include" -I"/home/ubuntu/R/x86_64-pc-linux-gnu-library/3.6/beachmat/include"   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c sum_counts.cpp -o sum_counts.o
g++ -std=gnu++11 -I"/usr/share/R/include" -DNDEBUG -I../inst/include/ -I"/home/ubuntu/R/x86_64-pc-linux-gnu-library/3.6/Rcpp/include" -I"/home/ubuntu/R/x86_64-pc-linux-gnu-library/3.6/beachmat/include"   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c utils.cpp -o utils.o
g++ -std=gnu++11 -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -Wl,-z,relro -o scuttle.so RcppExports.o combined_qc.o downsample_counts.o fit_linear_model.o sum_counts.o utils.o -llapack -lblas -lgfortran -lm -lquadmath -L/usr/lib/R/lib -lR
installing to /home/ubuntu/R/x86_64-pc-linux-gnu-library/3.6/00LOCK-scuttle/00new/scuttle/libs
** R
** inst
** byte-compile and prepare package for lazy loading
Error: object ‘make_zero_col_DFrame’ is not exported by 'namespace:S4Vectors'
Execution halted
ERROR: lazy loading failed for package ‘scuttle’
* removing ‘/home/ubuntu/R/x86_64-pc-linux-gnu-library/3.6/scuttle’
Error: Failed to install 'scuttle' from GitHub:
  (converted from warning) installation of package ‘/tmp/RtmpceMH6Z/file242432a0e43f/scuttle_0.99.9.tar.gz’ had non-zero exit status
> 

Please advise how I can resolve the issue.

My R session is:

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.6 LTS

Matrix products: default
BLAS:   /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] rstudioapi_0.11     magrittr_1.5        usethis_1.6.1       devtools_2.3.0      pkgload_1.1.0       R6_2.4.1           
 [7] rlang_0.4.6         fansi_0.4.1         tools_3.6.1         pkgbuild_1.0.8      sessioninfo_1.1.1   cli_2.0.2          
[13] withr_2.2.0         ellipsis_0.3.1      remotes_2.1.1       assertthat_0.2.1    digest_0.6.25       rprojroot_1.3-2    
[19] crayon_1.3.4        processx_3.4.2      BiocManager_1.30.10 callr_3.4.3         fs_1.4.1            ps_1.3.3           
[25] curl_4.3            testthat_2.3.2      memoise_1.1.0       glue_1.4.1          compiler_3.6.1      desc_1.2.0         
[31] backports_1.1.7     prettyunits_1.1.1  
> 

Choice of normalization strategy with multiple batches

Hi,

first of all I just wanted to say it's a pleasure to read your documentation, answers and even the code you write as they're always clear and full of opportunities for learning.

Now to the matter at hand: I have a quite large dataset of single nucleus RNA-seq from 8 individuals (8 separate 10X runs). These were prepped/sequenced in 2 different batches, but I am not interested in the inter-individual variability. In other words, I think it is safe to say that I can remove the batch effect (1 and 2) by removing the individual effect (1:8).

By reading on scran, batchelor and looking at your workflow on the Bioconductor OSCA website, however, I am still undecided as to what is the best strategy to normalize my data and wanted to ask for advice.

I have already conducted all the necessary count-level QC (capping low/high library sizes, capping % mitochondrial content, removing empty droplets, removing non-identified genes, etc). This was done on a merged count matrix so there is no subsetting problem.

Now for normalization, I reckon I have the following options:

  • scran pooling and deconvolution normalization on all individuals in the merged object, ignoring the individual: quickCluster, then computeSumFactors, then logNormCounts. This appears to be what you did in the pancreas datasets in the OSCA tutorials (although I do not know whether in that case the different individuals were different 10X captures).

  • scran pooling and deconvolution as before, only this is done separately on each sample (i.e. subsetting each object by individual and running the normalization steps separately). It's very fast and easy to parallelize, which I don't dislike, and may make more sense in case clustering results are largely different for each individual. However, this may still introduce some bias as size factors may have different scales. I have to say I do not have large differences in coverage across batches/individuals so I do not expect this to be a big issue. Perhaps plotting the deconvolution size factors for each individual separately against the library size factors may shed some light on whether that's the case.

  • multiBatchNorm on the merged object, specifying the batch, then logNormCounts. This method should solve the size factor scaling issue, but it is unclear to me whether it still uses the clustering + deconvolution approach (which I wanted to use given its success in some benchmarks) or whether it is a different method. Moreover, in the OSCA tutorials (chapter 13) the use of combineVar is suggested for HVG selection, whereas I thought it would be sufficient to model the mean-variance trend with the blocked design.

I would then use fastMNN to remove any further batch effect that would still be present.

So, the question: what do you think is the most sensible approach? Is there something I'm missing?

Thanks for your time.
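
For concreteness, the third option could be sketched roughly as below. This is only an illustration on simulated counts, not a recommendation: `mockSCE()` data and the `individual` column are assumptions made up for the example, and the sketch relies on my understanding that multiBatchNorm() rescales whatever size factors are already stored in the object (here, the deconvolution ones) rather than recomputing them.

```r
library(scran)      # quickCluster(), computeSumFactors(), modelGeneVar(), getTopHVGs()
library(scuttle)    # mockSCE()
library(batchelor)  # multiBatchNorm(), fastMNN()

set.seed(100)
sce <- mockSCE(ncells = 800, ngenes = 1000)   # simulated counts, illustration only
# Hypothetical per-individual label (4 individuals, 200 cells each).
sce$individual <- factor(rep(paste0("ind", 1:4), each = 200))

# Deconvolution size factors, with pre-clustering done within each individual.
clusters <- quickCluster(sce, block = sce$individual)
sce <- computeSumFactors(sce, clusters = clusters)

# Rescale the stored size factors across individuals and compute log-expression values.
sce <- multiBatchNorm(sce, batch = sce$individual)

# Model the mean-variance trend within each individual via the blocked design.
dec <- modelGeneVar(sce, block = sce$individual)
hvgs <- getTopHVGs(dec, n = 500)

# Remove the remaining individual/batch effect on the selected genes.
mnn.out <- fastMNN(sce, batch = sce$individual, subset.row = hvgs, d = 20)
```

The corrected low-dimensional coordinates would then be in `reducedDim(mnn.out, "corrected")`. Plotting `sizeFactors(sce)` against the library sizes per individual before the multiBatchNorm() step is one way to check whether the scaling concern in the second option applies.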

fastMNN error when supplying SCE with dimnames and non-NULL use.dimred

suppressPackageStartupMessages(library(batchelor))

B1 <- matrix(rnorm(10000), ncol = 40, dimnames = list(NULL, paste0("B1_", 1:40)))
B2 <- matrix(rnorm(10000), ncol = 40, dimnames = list(NULL, paste0("B2_", 1:40)))
batch <- c(rep(1, ncol(B1)), rep(2, ncol(B2)))
sce <- SingleCellExperiment(list(logcounts = cbind(B1, B2)))
assay(sce, "cosnormed") <- cosineNorm(logcounts(sce))
set.seed(666)
pcs <- multiBatchPCA(sce, batch = batch, assay.type = "cosnormed")
#> Warning in sweep(centered, 2, w, "/", check.margin = FALSE): 'check.margin' is ignored when 'x' is a DelayedArray object or
#>   derivative
reducedDim(sce, "PCA") <- do.call(rbind, pcs)

set.seed(666)
named_sce <- fastMNN(sce, batch = batch) # Works
#> Warning in sweep(centered, 2, w, "/", check.margin = FALSE): 'check.margin' is ignored when 'x' is a DelayedArray object or
#>   derivative
set.seed(666)
unnamed_sce <- fastMNN(unname(sce), batch = batch) # Works
#> Warning in sweep(centered, 2, w, "/", check.margin = FALSE): 'check.margin' is ignored when 'x' is a DelayedArray object or
#>   derivative
all.equal(reducedDim(named_sce), reducedDim(unnamed_sce), 
          check.attributes = FALSE) # Sanity check
#> [1] TRUE

unnamed_pca <- fastMNN(unname(sce), batch = batch, use.dimred = "PCA") # Works
all.equal(unnamed_pca$corrected, reducedDim(unnamed_sce)) # Sanity check
#> [1] TRUE
named_pca <- fastMNN(sce, batch = batch, use.dimred = "PCA") # Errors
#> Error in dimnames(x) <- dn: length of 'dimnames' [2] not equal to array extent

Created on 2019-05-22 by the reprex package (v0.2.1)

Session info
devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.6.0 (2019-04-26)
#>  os       Ubuntu 18.04.2 LTS          
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language en_AU:en                    
#>  collate  en_AU.UTF-8                 
#>  ctype    en_AU.UTF-8                 
#>  tz       Australia/Melbourne         
#>  date     2019-05-22                  
#> 
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package              * version   date       lib source        
#>  assertthat             0.2.1     2019-03-21 [3] CRAN (R 3.5.3)
#>  backports              1.1.4     2019-04-10 [3] CRAN (R 3.5.3)
#>  batchelor            * 1.0.0     2019-05-02 [1] Bioconductor  
#>  beeswarm               0.2.3     2016-04-25 [1] CRAN (R 3.6.0)
#>  Biobase              * 2.44.0    2019-05-02 [1] Bioconductor  
#>  BiocGenerics         * 0.30.0    2019-05-02 [1] Bioconductor  
#>  BiocNeighbors          1.2.0     2019-05-02 [1] Bioconductor  
#>  BiocParallel         * 1.18.0    2019-05-03 [1] Bioconductor  
#>  BiocSingular           1.0.0     2019-05-02 [1] Bioconductor  
#>  bitops                 1.0-6     2013-08-17 [3] CRAN (R 3.5.0)
#>  callr                  3.2.0     2019-03-15 [3] CRAN (R 3.5.3)
#>  cli                    1.1.0     2019-03-19 [3] CRAN (R 3.5.3)
#>  colorspace             1.4-1     2019-03-18 [3] CRAN (R 3.5.3)
#>  crayon                 1.3.4     2017-09-16 [3] CRAN (R 3.5.0)
#>  DelayedArray         * 0.10.0    2019-05-02 [1] Bioconductor  
#>  DelayedMatrixStats     1.6.0     2019-05-02 [1] Bioconductor  
#>  desc                   1.2.0     2018-05-01 [3] CRAN (R 3.5.0)
#>  devtools               2.0.2     2019-04-08 [1] CRAN (R 3.6.0)
#>  digest                 0.6.18    2018-10-10 [3] CRAN (R 3.5.1)
#>  dplyr                  0.8.0.1   2019-02-15 [3] CRAN (R 3.5.2)
#>  evaluate               0.13      2019-02-12 [3] CRAN (R 3.5.2)
#>  fs                     1.3.1     2019-05-06 [3] CRAN (R 3.6.0)
#>  GenomeInfoDb         * 1.20.0    2019-05-02 [1] Bioconductor  
#>  GenomeInfoDbData       1.2.1     2019-05-07 [1] Bioconductor  
#>  GenomicRanges        * 1.36.0    2019-05-02 [1] Bioconductor  
#>  ggbeeswarm             0.6.0     2017-08-07 [1] CRAN (R 3.6.0)
#>  ggplot2                3.1.1     2019-04-07 [3] CRAN (R 3.5.3)
#>  glue                   1.3.1     2019-03-12 [3] CRAN (R 3.5.3)
#>  gridExtra              2.3       2017-09-09 [1] CRAN (R 3.6.0)
#>  gtable                 0.3.0     2019-03-25 [3] CRAN (R 3.5.3)
#>  highr                  0.8       2019-03-20 [3] CRAN (R 3.5.3)
#>  htmltools              0.3.6     2017-04-28 [3] CRAN (R 3.5.0)
#>  IRanges              * 2.18.0    2019-05-02 [1] Bioconductor  
#>  irlba                  2.3.3     2019-02-05 [1] CRAN (R 3.6.0)
#>  knitr                  1.22      2019-03-08 [3] CRAN (R 3.5.2)
#>  lattice                0.20-38   2018-11-04 [4] CRAN (R 3.5.1)
#>  lazyeval               0.2.2     2019-03-15 [3] CRAN (R 3.5.3)
#>  magrittr               1.5       2014-11-22 [3] CRAN (R 3.5.0)
#>  Matrix                 1.2-17    2019-03-22 [4] CRAN (R 3.5.3)
#>  matrixStats          * 0.54.0    2018-07-23 [1] CRAN (R 3.6.0)
#>  memoise                1.1.0     2017-04-21 [1] CRAN (R 3.6.0)
#>  munsell                0.5.0     2018-06-12 [3] CRAN (R 3.5.0)
#>  pillar                 1.3.1     2018-12-15 [3] CRAN (R 3.5.2)
#>  pkgbuild               1.0.3     2019-03-20 [3] CRAN (R 3.5.3)
#>  pkgconfig              2.0.2     2018-08-16 [3] CRAN (R 3.5.1)
#>  pkgload                1.0.2     2018-10-29 [3] CRAN (R 3.5.1)
#>  plyr                   1.8.4     2016-06-08 [3] CRAN (R 3.5.0)
#>  prettyunits            1.0.2     2015-07-13 [3] CRAN (R 3.5.0)
#>  processx               3.3.1     2019-05-08 [3] CRAN (R 3.6.0)
#>  ps                     1.3.0     2018-12-21 [3] CRAN (R 3.5.2)
#>  purrr                  0.3.2     2019-03-15 [3] CRAN (R 3.5.3)
#>  R6                     2.4.0     2019-02-14 [3] CRAN (R 3.5.2)
#>  Rcpp                   1.0.1     2019-03-17 [3] CRAN (R 3.5.3)
#>  RCurl                  1.95-4.12 2019-03-04 [3] CRAN (R 3.5.2)
#>  remotes                2.0.4     2019-04-10 [1] CRAN (R 3.6.0)
#>  rlang                  0.3.4     2019-04-07 [3] CRAN (R 3.5.3)
#>  rmarkdown              1.12      2019-03-14 [3] CRAN (R 3.5.3)
#>  rprojroot              1.3-2     2018-01-03 [3] CRAN (R 3.5.3)
#>  rsvd                   1.0.0     2018-11-06 [1] CRAN (R 3.6.0)
#>  S4Vectors            * 0.22.0    2019-05-02 [1] Bioconductor  
#>  scales                 1.0.0     2018-08-09 [3] CRAN (R 3.5.1)
#>  scater                 1.12.1    2019-05-15 [1] Bioconductor  
#>  sessioninfo            1.1.1     2018-11-05 [1] CRAN (R 3.6.0)
#>  SingleCellExperiment * 1.6.0     2019-05-02 [1] Bioconductor  
#>  stringi                1.4.3     2019-03-12 [3] CRAN (R 3.5.3)
#>  stringr                1.4.0     2019-02-10 [3] CRAN (R 3.5.2)
#>  SummarizedExperiment * 1.14.0    2019-05-02 [1] Bioconductor  
#>  testthat               2.1.1     2019-04-23 [1] CRAN (R 3.6.0)
#>  tibble                 2.1.1     2019-03-16 [3] CRAN (R 3.5.3)
#>  tidyselect             0.2.5     2018-10-11 [3] CRAN (R 3.5.1)
#>  usethis                1.5.0     2019-04-07 [1] CRAN (R 3.6.0)
#>  vipor                  0.4.5     2017-03-22 [1] CRAN (R 3.6.0)
#>  viridis                0.5.1     2018-03-29 [1] CRAN (R 3.6.0)
#>  viridisLite            0.3.0     2018-02-01 [3] CRAN (R 3.5.0)
#>  withr                  2.1.2     2018-03-15 [3] CRAN (R 3.5.0)
#>  xfun                   0.6       2019-04-02 [3] CRAN (R 3.5.3)
#>  XVector                0.24.0    2019-05-02 [1] Bioconductor  
#>  yaml                   2.2.0     2018-07-25 [3] CRAN (R 3.5.1)
#>  zlibbioc               1.30.0    2019-05-02 [1] Bioconductor  
#> 
#> [1] /home/peter/R/x86_64-pc-linux-gnu-library/3.6
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library

The last line should work and all.equal(named_pca$corrected, reducedDim(named_sce)) should be TRUE, right?
I got a bit lost in the traceback() trying to figure this one out.
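
For anyone else reading the traceback: the message itself is a generic base-R error raised when a `dimnames<-` assignment supplies names whose length does not match the object. The snippet below reproduces only that message on a toy matrix; it is a guess at the kind of mismatch involved, not a diagnosis of what fastMNN does internally.

```r
# A 3 x 2 matrix standing in for a small table of PC scores (hypothetical data).
x <- matrix(0, nrow = 3, ncol = 2)

# Supplying three column names for a two-column matrix fails, because the
# second dimnames element does not match the number of columns.
msg <- tryCatch({
    dimnames(x) <- list(NULL, c("a", "b", "c"))
    "no error"
}, error = function(e) conditionMessage(e))

print(msg)
```

If something similar happens inside the function, it would be consistent with names taken from the full SCE being applied to an object with a different number of columns, which would also explain why `unname(sce)` avoids the error.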

Feature request: fastMNN on PCs

Hi,

Hopefully this is the right place for this.
I'm a big fan of fastMNN, but would like to be able to apply it directly to a set of reduced dimensions (PCs, LSI, etc.) rather than to a SingleCellExperiment object. I think this was an option in an earlier version of scran, but it seems to have vanished more recently.
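
As a rough sketch of what is being asked for: batchelor versions that provide reducedMNN() accept one low-dimensional matrix per batch (cells in rows) and correct them directly. The matrices below are random placeholders invented for the example; in practice they should come from a single decomposition shared across batches (for example, multiBatchPCA) so that the dimensions are comparable.

```r
library(batchelor)

set.seed(1)
# Hypothetical reduced-dimension matrices: rows are cells, columns are components.
pcs1 <- matrix(rnorm(200 * 10), ncol = 10)
pcs2 <- matrix(rnorm(150 * 10, mean = 1), ncol = 10)

# Correct the second batch towards the first using mutual nearest neighbours.
out <- reducedMNN(pcs1, pcs2)

dim(out$corrected)   # corrected coordinates for all cells
table(out$batch)     # batch of origin for each row
```

Whether this function is available depends on the installed batchelor version, so treat the call as an assumption to check against `?reducedMNN`.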

error in batchelor installation

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("batchelor")
'getOption("repos")' replaces Bioconductor standard repositories, see '?repositories' for details

replacement repositories:
CRAN: https://cran.rstudio.com/

Bioconductor version 3.14 (BiocManager 1.30.16), R 4.1.2 (2021-11-01)
Installing package(s) 'batchelor'
also installing the dependency ‘scuttle’

Packages which are only available in source form, and may need compilation of C/C++/Fortran: ‘scuttle’
‘batchelor’
Do you want to attempt to install these from sources? (Yes/no/cancel) yes
installing the source packages ‘scuttle’, ‘batchelor’

trying URL 'https://bioconductor.org/packages/3.14/bioc/src/contrib/scuttle_1.4.0.tar.gz'
Content type 'application/x-gzip' length 979954 bytes (956 KB)

downloaded 956 KB

trying URL 'https://bioconductor.org/packages/3.14/bioc/src/contrib/batchelor_1.10.0.tar.gz'
Content type 'application/x-gzip' length 1996990 bytes (1.9 MB)

downloaded 1.9 MB

* installing *source* package ‘scuttle’ ...
    ** using staged installation
    ** libs
    clang++ -arch arm64 -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I../inst/include/ -I'/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/Rcpp/include' -I'/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include' -I/opt/R/arm64/include -fPIC -falign-functions=64 -Wall -g -O2 -c RcppExports.cpp -o RcppExports.o
    clang++ -arch arm64 -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I../inst/include/ -I'/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/Rcpp/include' -I'/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include' -I/opt/R/arm64/include -fPIC -falign-functions=64 -Wall -g -O2 -c cumulative_prop.cpp -o cumulative_prop.o
    In file included from cumulative_prop.cpp:2:
    In file included from /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/beachmat.h:24:
    In file included from /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/read_lin_block.h:11:
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:218:43: warning: 'beachmat::lin_sparse_matrix::get_row' hides overloaded virtual functions [-Woverloaded-virtual]
    virtual sparse_index<const int*, int> get_row(size_t r, int* work_x, int* work_i, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:66:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 5)
    virtual const int* get_row(size_t r, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:95:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 5)
    virtual const double* get_row(size_t r, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:236:43: warning: 'beachmat::lin_sparse_matrix::get_col' hides overloaded virtual functions [-Woverloaded-virtual]
    virtual sparse_index<const int*, int> get_col(size_t c, int* work_x, int* work_i, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:52:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 5)
    virtual const int* get_col(size_t c, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:81:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 5)
    virtual const double* get_col(size_t c, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:254:46: warning: 'beachmat::lin_sparse_matrix::get_row' hides overloaded virtual functions [-Woverloaded-virtual]
    virtual sparse_index<const double*, int> get_row(size_t r, double* work_x, int* work_i, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:66:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 5)
    virtual const int* get_row(size_t r, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:95:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 5)
    virtual const double* get_row(size_t r, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:272:46: warning: 'beachmat::lin_sparse_matrix::get_col' hides overloaded virtual functions [-Woverloaded-virtual]
    virtual sparse_index<const double*, int> get_col(size_t c, double* work_x, int* work_i, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:52:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 5)
    virtual const int* get_col(size_t c, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:81:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 5)
    virtual const double* get_col(size_t c, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:287:35: warning: 'beachmat::lin_sparse_matrix::get_col' hides overloaded virtual functions [-Woverloaded-virtual]
    sparse_index<const int*, int> get_col(size_t c, int* work_x, int* work_i) {
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:52:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 3)
    virtual const int* get_col(size_t c, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:81:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 3)
    virtual const double* get_col(size_t c, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:304:35: warning: 'beachmat::lin_sparse_matrix::get_row' hides overloaded virtual functions [-Woverloaded-virtual]
    sparse_index<const int*, int> get_row(size_t r, int* work_x, int* work_i) {
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:66:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 3)
    virtual const int* get_row(size_t r, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:95:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 3)
    virtual const double* get_row(size_t r, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:321:38: warning: 'beachmat::lin_sparse_matrix::get_col' hides overloaded virtual functions [-Woverloaded-virtual]
    sparse_index<const double*, int> get_col(size_t c, double* work_x, int* work_i) {
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:52:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 3)
    virtual const int* get_col(size_t c, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:81:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 3)
    virtual const double* get_col(size_t c, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:338:38: warning: 'beachmat::lin_sparse_matrix::get_row' hides overloaded virtual functions [-Woverloaded-virtual]
    sparse_index<const double*, int> get_row(size_t r, double* work_x, int* work_i) {
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:66:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 3)
    virtual const int* get_row(size_t r, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:95:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 3)
    virtual const double* get_row(size_t r, double* work, size_t first, size_t last) = 0;
    ^
    8 warnings generated.
    clang++ -arch arm64 -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I../inst/include/ -I'/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/Rcpp/include' -I'/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include' -I/opt/R/arm64/include -fPIC -falign-functions=64 -Wall -g -O2 -c downsample_counts.cpp -o downsample_counts.o
    In file included from downsample_counts.cpp:2:
    In file included from /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/beachmat.h:24:
    In file included from /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/read_lin_block.h:11:
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:218:43: warning: 'beachmat::lin_sparse_matrix::get_row' hides overloaded virtual functions [-Woverloaded-virtual]
    virtual sparse_index<const int*, int> get_row(size_t r, int* work_x, int* work_i, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:66:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 5)
    virtual const int* get_row(size_t r, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:95:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 5)
    virtual const double* get_row(size_t r, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:236:43: warning: 'beachmat::lin_sparse_matrix::get_col' hides overloaded virtual functions [-Woverloaded-virtual]
    virtual sparse_index<const int*, int> get_col(size_t c, int* work_x, int* work_i, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:52:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 5)
    virtual const int* get_col(size_t c, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:81:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 5)
    virtual const double* get_col(size_t c, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:254:46: warning: 'beachmat::lin_sparse_matrix::get_row' hides overloaded virtual functions [-Woverloaded-virtual]
    virtual sparse_index<const double*, int> get_row(size_t r, double* work_x, int* work_i, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:66:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 5)
    virtual const int* get_row(size_t r, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:95:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 5)
    virtual const double* get_row(size_t r, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:272:46: warning: 'beachmat::lin_sparse_matrix::get_col' hides overloaded virtual functions [-Woverloaded-virtual]
    virtual sparse_index<const double*, int> get_col(size_t c, double* work_x, int* work_i, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:52:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 5)
    virtual const int* get_col(size_t c, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:81:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 5)
    virtual const double* get_col(size_t c, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:287:35: warning: 'beachmat::lin_sparse_matrix::get_col' hides overloaded virtual functions [-Woverloaded-virtual]
    sparse_index<const int*, int> get_col(size_t c, int* work_x, int* work_i) {
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:52:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 3)
    virtual const int* get_col(size_t c, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:81:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 3)
    virtual const double* get_col(size_t c, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:304:35: warning: 'beachmat::lin_sparse_matrix::get_row' hides overloaded virtual functions [-Woverloaded-virtual]
    sparse_index<const int*, int> get_row(size_t r, int* work_x, int* work_i) {
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:66:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 3)
    virtual const int* get_row(size_t r, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:95:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 3)
    virtual const double* get_row(size_t r, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:321:38: warning: 'beachmat::lin_sparse_matrix::get_col' hides overloaded virtual functions [-Woverloaded-virtual]
    sparse_index<const double*, int> get_col(size_t c, double* work_x, int* work_i) {
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:52:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 3)
    virtual const int* get_col(size_t c, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:81:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 3)
    virtual const double* get_col(size_t c, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:338:38: warning: 'beachmat::lin_sparse_matrix::get_row' hides overloaded virtual functions [-Woverloaded-virtual]
    sparse_index<const double*, int> get_row(size_t r, double* work_x, int* work_i) {
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:66:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 3)
    virtual const int* get_row(size_t r, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:95:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 3)
    virtual const double* get_row(size_t r, double* work, size_t first, size_t last) = 0;
    ^
    8 warnings generated.
    clang++ -arch arm64 -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I../inst/include/ -I'/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/Rcpp/include' -I'/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include' -I/opt/R/arm64/include -fPIC -falign-functions=64 -Wall -g -O2 -c fit_linear_model.cpp -o fit_linear_model.o
    In file included from fit_linear_model.cpp:2:
    In file included from /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/beachmat.h:24:
    In file included from /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/read_lin_block.h:11:
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:218:43: warning: 'beachmat::lin_sparse_matrix::get_row' hides overloaded virtual functions [-Woverloaded-virtual]
    virtual sparse_index<const int*, int> get_row(size_t r, int* work_x, int* work_i, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:66:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 5)
    virtual const int* get_row(size_t r, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:95:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 5)
    virtual const double* get_row(size_t r, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:236:43: warning: 'beachmat::lin_sparse_matrix::get_col' hides overloaded virtual functions [-Woverloaded-virtual]
    virtual sparse_index<const int*, int> get_col(size_t c, int* work_x, int* work_i, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:52:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 5)
    virtual const int* get_col(size_t c, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:81:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 5)
    virtual const double* get_col(size_t c, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:254:46: warning: 'beachmat::lin_sparse_matrix::get_row' hides overloaded virtual functions [-Woverloaded-virtual]
    virtual sparse_index<const double*, int> get_row(size_t r, double* work_x, int* work_i, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:66:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 5)
    virtual const int* get_row(size_t r, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:95:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 5)
    virtual const double* get_row(size_t r, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:272:46: warning: 'beachmat::lin_sparse_matrix::get_col' hides overloaded virtual functions [-Woverloaded-virtual]
    virtual sparse_index<const double*, int> get_col(size_t c, double* work_x, int* work_i, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:52:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 5)
    virtual const int* get_col(size_t c, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:81:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 5)
    virtual const double* get_col(size_t c, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:287:35: warning: 'beachmat::lin_sparse_matrix::get_col' hides overloaded virtual functions [-Woverloaded-virtual]
    sparse_index<const int*, int> get_col(size_t c, int* work_x, int* work_i) {
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:52:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 3)
    virtual const int* get_col(size_t c, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:81:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 3)
    virtual const double* get_col(size_t c, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:304:35: warning: 'beachmat::lin_sparse_matrix::get_row' hides overloaded virtual functions [-Woverloaded-virtual]
    sparse_index<const int*, int> get_row(size_t r, int* work_x, int* work_i) {
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:66:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 3)
    virtual const int* get_row(size_t r, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:95:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 3)
    virtual const double* get_row(size_t r, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:321:38: warning: 'beachmat::lin_sparse_matrix::get_col' hides overloaded virtual functions [-Woverloaded-virtual]
    sparse_index<const double*, int> get_col(size_t c, double* work_x, int* work_i) {
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:52:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 3)
    virtual const int* get_col(size_t c, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:81:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 3)
    virtual const double* get_col(size_t c, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:338:38: warning: 'beachmat::lin_sparse_matrix::get_row' hides overloaded virtual functions [-Woverloaded-virtual]
    sparse_index<const double*, int> get_row(size_t r, double* work_x, int* work_i) {
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:66:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 3)
    virtual const int* get_row(size_t r, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:95:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 3)
    virtual const double* get_row(size_t r, double* work, size_t first, size_t last) = 0;
    ^
    8 warnings generated.
    clang++ -arch arm64 -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I../inst/include/ -I'/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/Rcpp/include' -I'/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include' -I/opt/R/arm64/include -fPIC -falign-functions=64 -Wall -g -O2 -c pool_size_factors.cpp -o pool_size_factors.o
    In file included from pool_size_factors.cpp:2:
    In file included from /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/beachmat.h:24:
    In file included from /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/read_lin_block.h:11:
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:218:43: warning: 'beachmat::lin_sparse_matrix::get_row' hides overloaded virtual functions [-Woverloaded-virtual]
    virtual sparse_index<const int*, int> get_row(size_t r, int* work_x, int* work_i, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:66:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 5)
    virtual const int* get_row(size_t r, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:95:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 5)
    virtual const double* get_row(size_t r, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:236:43: warning: 'beachmat::lin_sparse_matrix::get_col' hides overloaded virtual functions [-Woverloaded-virtual]
    virtual sparse_index<const int*, int> get_col(size_t c, int* work_x, int* work_i, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:52:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 5)
    virtual const int* get_col(size_t c, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:81:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 5)
    virtual const double* get_col(size_t c, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:254:46: warning: 'beachmat::lin_sparse_matrix::get_row' hides overloaded virtual functions [-Woverloaded-virtual]
    virtual sparse_index<const double*, int> get_row(size_t r, double* work_x, int* work_i, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:66:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 5)
    virtual const int* get_row(size_t r, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:95:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 5)
    virtual const double* get_row(size_t r, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:272:46: warning: 'beachmat::lin_sparse_matrix::get_col' hides overloaded virtual functions [-Woverloaded-virtual]
    virtual sparse_index<const double*, int> get_col(size_t c, double* work_x, int* work_i, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:52:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 5)
    virtual const int* get_col(size_t c, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:81:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 5)
    virtual const double* get_col(size_t c, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:287:35: warning: 'beachmat::lin_sparse_matrix::get_col' hides overloaded virtual functions [-Woverloaded-virtual]
    sparse_index<const int*, int> get_col(size_t c, int* work_x, int* work_i) {
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:52:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 3)
    virtual const int* get_col(size_t c, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:81:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 3)
    virtual const double* get_col(size_t c, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:304:35: warning: 'beachmat::lin_sparse_matrix::get_row' hides overloaded virtual functions [-Woverloaded-virtual]
    sparse_index<const int*, int> get_row(size_t r, int* work_x, int* work_i) {
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:66:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 3)
    virtual const int* get_row(size_t r, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:95:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 3)
    virtual const double* get_row(size_t r, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:321:38: warning: 'beachmat::lin_sparse_matrix::get_col' hides overloaded virtual functions [-Woverloaded-virtual]
    sparse_index<const double*, int> get_col(size_t c, double* work_x, int* work_i) {
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:52:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 3)
    virtual const int* get_col(size_t c, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:81:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 3)
    virtual const double* get_col(size_t c, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:338:38: warning: 'beachmat::lin_sparse_matrix::get_row' hides overloaded virtual functions [-Woverloaded-virtual]
    sparse_index<const double*, int> get_row(size_t r, double* work_x, int* work_i) {
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:66:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 3)
    virtual const int* get_row(size_t r, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:95:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 3)
    virtual const double* get_row(size_t r, double* work, size_t first, size_t last) = 0;
    ^
    8 warnings generated.
    clang++ -arch arm64 -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I../inst/include/ -I'/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/Rcpp/include' -I'/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include' -I/opt/R/arm64/include -fPIC -falign-functions=64 -Wall -g -O2 -c sum_counts.cpp -o sum_counts.o
    In file included from sum_counts.cpp:2:
    In file included from /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/beachmat.h:24:
    In file included from /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/read_lin_block.h:11:
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:218:43: warning: 'beachmat::lin_sparse_matrix::get_row' hides overloaded virtual functions [-Woverloaded-virtual]
    virtual sparse_index<const int*, int> get_row(size_t r, int* work_x, int* work_i, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:66:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 5)
    virtual const int* get_row(size_t r, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:95:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 5)
    virtual const double* get_row(size_t r, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:236:43: warning: 'beachmat::lin_sparse_matrix::get_col' hides overloaded virtual functions [-Woverloaded-virtual]
    virtual sparse_index<const int*, int> get_col(size_t c, int* work_x, int* work_i, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:52:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 5)
    virtual const int* get_col(size_t c, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:81:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 5)
    virtual const double* get_col(size_t c, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:254:46: warning: 'beachmat::lin_sparse_matrix::get_row' hides overloaded virtual functions [-Woverloaded-virtual]
    virtual sparse_index<const double*, int> get_row(size_t r, double* work_x, int* work_i, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:66:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 5)
    virtual const int* get_row(size_t r, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:95:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 5)
    virtual const double* get_row(size_t r, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:272:46: warning: 'beachmat::lin_sparse_matrix::get_col' hides overloaded virtual functions [-Woverloaded-virtual]
    virtual sparse_index<const double*, int> get_col(size_t c, double* work_x, int* work_i, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:52:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 5)
    virtual const int* get_col(size_t c, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:81:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 5)
    virtual const double* get_col(size_t c, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:287:35: warning: 'beachmat::lin_sparse_matrix::get_col' hides overloaded virtual functions [-Woverloaded-virtual]
    sparse_index<const int*, int> get_col(size_t c, int* work_x, int* work_i) {
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:52:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 3)
    virtual const int* get_col(size_t c, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:81:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 3)
    virtual const double* get_col(size_t c, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:304:35: warning: 'beachmat::lin_sparse_matrix::get_row' hides overloaded virtual functions [-Woverloaded-virtual]
    sparse_index<const int*, int> get_row(size_t r, int* work_x, int* work_i) {
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:66:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 3)
    virtual const int* get_row(size_t r, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:95:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 3)
    virtual const double* get_row(size_t r, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:321:38: warning: 'beachmat::lin_sparse_matrix::get_col' hides overloaded virtual functions [-Woverloaded-virtual]
    sparse_index<const double*, int> get_col(size_t c, double* work_x, int* work_i) {
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:52:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 3)
    virtual const int* get_col(size_t c, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:81:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_col' declared here: different number of parameters (4 vs 3)
    virtual const double* get_col(size_t c, double* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:338:38: warning: 'beachmat::lin_sparse_matrix::get_row' hides overloaded virtual functions [-Woverloaded-virtual]
    sparse_index<const double*, int> get_row(size_t r, double* work_x, int* work_i) {
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:66:24: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 3)
    virtual const int* get_row(size_t r, int* work, size_t first, size_t last) = 0;
    ^
    /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include/beachmat3/lin_matrix.h:95:27: note: hidden overloaded virtual function 'beachmat::lin_matrix::get_row' declared here: different number of parameters (4 vs 3)
    virtual const double* get_row(size_t r, double* work, size_t first, size_t last) = 0;
    ^
    8 warnings generated.
    clang++ -arch arm64 -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I../inst/include/ -I'/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/Rcpp/include' -I'/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/beachmat/include' -I/opt/R/arm64/include -fPIC -falign-functions=64 -Wall -g -O2 -c utils.cpp -o utils.o
    clang++ -arch arm64 -std=gnu++11 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib -L/opt/R/arm64/lib -o scuttle.so RcppExports.o cumulative_prop.o downsample_counts.o fit_linear_model.o pool_size_factors.o sum_counts.o utils.o -L/Library/Frameworks/R.framework/Resources/lib -lRlapack -L/Library/Frameworks/R.framework/Resources/lib -lRblas -L/opt/R/arm64/gfortran/lib/gcc/aarch64-apple-darwin20.2.0/11.0.0 -L/opt/R/arm64/gfortran/lib -lgfortran -lemutls_w -lm -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
    ld: warning: directory not found for option '-L/opt/R/arm64/gfortran/lib/gcc/aarch64-apple-darwin20.2.0/11.0.0'
    ld: warning: directory not found for option '-L/opt/R/arm64/gfortran/lib'
    ld: library not found for -lgfortran
    clang: error: linker command failed with exit code 1 (use -v to see invocation)
    make: *** [scuttle.so] Error 1
    ERROR: compilation failed for package ‘scuttle’
  • removing ‘/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/scuttle’
    ERROR: dependency ‘scuttle’ is not available for package ‘batchelor’
  • removing ‘/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/batchelor’

The downloaded source packages are in
‘/private/var/folders/mq/j68g448j3j59cct30c1wb9hh0000gp/T/Rtmpp9wICn/downloaded_packages’
Old packages: 'Matrix'
Update all/some/none? [a/s/n]:
n
Warning messages:
1: In .inet_warning(msg) :
unable to access index for repository https://bioconductor.org/packages/3.14/bioc/bin/macosx/big-sur-arm64/contrib/4.1:
cannot open URL 'https://bioconductor.org/packages/3.14/bioc/bin/macosx/big-sur-arm64/contrib/4.1/PACKAGES'
2: In .inet_warning(msg) :
unable to access index for repository https://bioconductor.org/packages/3.14/data/annotation/bin/macosx/big-sur-arm64/contrib/4.1:
cannot open URL 'https://bioconductor.org/packages/3.14/data/annotation/bin/macosx/big-sur-arm64/contrib/4.1/PACKAGES'
3: In .inet_warning(msg) :
unable to access index for repository https://bioconductor.org/packages/3.14/data/experiment/bin/macosx/big-sur-arm64/contrib/4.1:
cannot open URL 'https://bioconductor.org/packages/3.14/data/experiment/bin/macosx/big-sur-arm64/contrib/4.1/PACKAGES'
4: In .inet_warning(msg) :
unable to access index for repository https://bioconductor.org/packages/3.14/workflows/bin/macosx/big-sur-arm64/contrib/4.1:
cannot open URL 'https://bioconductor.org/packages/3.14/workflows/bin/macosx/big-sur-arm64/contrib/4.1/PACKAGES'
5: In .inet_warning(msg) :
unable to access index for repository https://bioconductor.org/packages/3.14/books/bin/macosx/big-sur-arm64/contrib/4.1:
cannot open URL 'https://bioconductor.org/packages/3.14/books/bin/macosx/big-sur-arm64/contrib/4.1/PACKAGES'
6: In .inet_warning(msg) :
installation of package ‘scuttle’ had non-zero exit status
7: In .inet_warning(msg) :
installation of package ‘batchelor’ had non-zero exit status
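
The fatal step in the log above is the link of scuttle.so: the two `-L/opt/R/arm64/gfortran/...` directories are reported as not found and `-lgfortran` cannot be resolved. A minimal sketch, using only base R, for checking from within the same R session whether those paths and a gfortran executable exist on the machine (the directory strings are copied from the linker warnings; nothing else about the setup is assumed):

```r
# Directories that the linker warnings above reported as missing.
gfortran_dirs <- c(
    "/opt/R/arm64/gfortran/lib/gcc/aarch64-apple-darwin20.2.0/11.0.0",
    "/opt/R/arm64/gfortran/lib"
)

# One TRUE/FALSE per directory; FALSE agrees with the 'directory not found' warnings.
dir_status <- dir.exists(gfortran_dirs)
names(dir_status) <- gfortran_dirs
print(dir_status)

# Empty string if no gfortran executable is on the PATH seen by R.
gfortran_path <- Sys.which("gfortran")
print(gfortran_path)
```

If both directories come back FALSE, the R build is presumably configured for a Fortran toolchain that is not installed at that location, which would explain why any source package linking against the Fortran runtime fails in the same way.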

how to do fastMNN compute.variances

Dear author,
I have seen that there was a compute.variances parameter in the previous version of fastMNN in the scran package. I want to check whether any biologically meaningful "batch effect" is lost when running fastMNN. How can I do this with the current fastMNN, which no longer has the compute.variances parameter? Thanks.

In detail:
I have three tissue samples from the same donor, and one specific cell type shows an obviously larger "batch effect" than the others. These cells do not cluster together without batch removal, but they merge into one cluster with fastMNN. I wonder whether any biologically meaningful "batch effect" is lost during fastMNN. I thought the compute.variances parameter might give me some hints about this.
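
For reference, a minimal sketch of where this information appears to live in the batchelor version of fastMNN(). It assumes that the output carries a `lost.var` matrix under `metadata(out)$merge.info`, which is how recent batchelor releases report the proportion of variance removed from each batch at each merge step; check `?fastMNN` for the installed version. The two batches are simulated matrices, not real data.

```r
library(batchelor)
library(S4Vectors)   # for metadata()

set.seed(100)
# Two simulated batches of log-expression values: 200 genes x 50 cells each.
B1 <- matrix(rnorm(10000, mean = -1), ncol = 50)
B2 <- matrix(rnorm(10000, mean = 1), ncol = 50)

# d = 10 keeps the PCA small for this toy example.
out <- fastMNN(B1, B2, d = 10)

# Assumed layout: one row per merge step, one column per batch.
# Each entry is the proportion of that batch's variance lost at that step.
lost <- metadata(out)$merge.info$lost.var
print(lost)
```

Large values for one batch would suggest that the correction is removing variation that is not shared with the other batches, which may be worth inspecting before trusting the merged cluster.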

Questions regarding multiBatchNorm and fastMNN based correction

I previously posted this on Bioconductor support, but later realized this would be a better place. As this is a separate question, I made a separate post.

I am trying to use the fastMNN approach to integrate multiple datasets. I went through the vignettes for scran and batchelor and the OSCA tutorial to understand how to pre-process my data and then perform fastMNN-based correction. I also followed the advice in this post: #12

But I just wanted to be sure I am getting this right: I should normalize each batch individually using clustering-based size factors (using scran), which would then be adjusted across batches by multiBatchNorm. So will it use the pre-computed normalized data to make the adjustment? Also, for marker analysis using findMarkers and conversion to edgeR, will it use the adjusted normalized values or the raw counts? I ask because Vieth et al. suggested that scran normalization with clustering is the best way to get good DE estimates, so I want to use the clustering-based normalized data for the merged data to perform DE analysis.

Also, after performing fastMNN, can I check how much of the variance each corrected PC explains, as with the elbow plot of a normal PCA?

Thanks! Piyush

Add a quickCorrect function

That does the intersection, multiBatchNorm and combining of variances, prior to calling batchCorrect(). While calling them separately is great for pedagogical value, actually having to type them all out hurts my fingers and is a pain to read.

The same function should also be in charge of merging disparate SCE objects (via the hypothetical combineCols) and attaching the correction results onto that merged object.
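
The steps listed above can be sketched as a single wrapper. This is only an illustration of the idea under stated assumptions, not the proposed implementation: `quick_correct_sketch` and its `n.hvgs` argument are hypothetical names, HVG selection via scran's modelGeneVar(), combineVar() and getTopHVGs() is one possible choice, and the merging of disparate SCE objects is left out entirely. The inputs are mock datasets from scuttle.

```r
library(batchelor)
library(scran)
library(scuttle)

# Hypothetical wrapper: intersect genes, rescale across batches,
# combine variance estimates, then call batchCorrect().
quick_correct_sketch <- function(x, y, n.hvgs = 500, PARAM = FastMnnParam()) {
    # Restrict both batches to the shared gene universe.
    universe <- intersect(rownames(x), rownames(y))
    x <- x[universe, ]
    y <- y[universe, ]

    # Rescale size factors across batches and compute log-expression values.
    normed <- multiBatchNorm(x, y)

    # Per-batch variance modelling, combined into one set of statistics.
    dec <- lapply(normed, modelGeneVar)
    combined <- combineVar(dec[[1]], dec[[2]])
    hvgs <- getTopHVGs(combined, n = n.hvgs)

    batchCorrect(normed[[1]], normed[[2]], PARAM = PARAM, subset.row = hvgs)
}

set.seed(1)
sce1 <- mockSCE()
sce2 <- mockSCE()
corrected <- quick_correct_sketch(sce1, sce2)
print(corrected)
```

Passing a different `PARAM` object should switch the correction method without touching the preprocessing steps, which is the main reason for routing the call through batchCorrect().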

correcting in multiple steps

Hi Aaron, thank you for developing and maintaining a great package!

I have time course data from 4 different time points, and each time point has multiple batches. There are biological differences between time points, but there are common cells too.

How would you recommend selecting HVGs, running multiBatchNorm and performing batch correction for such a dataset?

If I batch correct within each time point first and then batch correct across time points, can I use the MNN-corrected matrix as input for a second MNN correction?

Thank you so much for your help!
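
One option that avoids feeding corrected values back into a second correction is a hierarchical merge in a single fastMNN() call, using a nested list for `merge.order`. The sketch below assumes two time points with two batches each; the batch names, the simulated matrices and `d = 10` are all placeholders for illustration.

```r
library(batchelor)

set.seed(42)
# Simulated log-expression batches: 200 genes x 60 cells each.
make_batch <- function(shift) matrix(rnorm(200 * 60, mean = shift), nrow = 200)
t1a <- make_batch(0)     # time point 1, batch A
t1b <- make_batch(0.5)   # time point 1, batch B
t2a <- make_batch(2)     # time point 2, batch A
t2b <- make_batch(2.5)   # time point 2, batch B

# Merge batches within each time point first (1 with 2, 3 with 4),
# then merge the two time points with each other.
out <- fastMNN(T1A = t1a, T1B = t1b, T2A = t2a, T2B = t2b,
    d = 10,
    merge.order = list(list(1, 2), list(3, 4)))

table(out$batch)
```

The indices in `merge.order` refer to the order in which the batches are supplied, so the nesting mirrors the experimental design directly.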

Problem with mixed DelayedMatrix & dgCMatrix data type

I have a list of SingleCellExperiment objects from 10X and non-10X experiments, and the counts and logcounts assays have a mixture of DelayedMatrix (10X) & dgCMatrix (non-10X) data types.

In this case, I noticed that fastMNN seems to go into a never-ending loop and does not complete even after a very long time. However, fastMNN runs without problems and very quickly if I explicitly convert the logcounts assays from DelayedMatrix to dgCMatrix, so that all inputs have the dgCMatrix data type.

R version 4.0.3 (2020-10-10)
batchelor_1.6.3
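
The conversion described above can be sketched as follows. This is a toy example assuming the DelayedArray and Matrix packages; `to_dgc` is a hypothetical helper name, and a small in-memory matrix stands in for a 10X-derived assay.

```r
library(DelayedArray)
library(Matrix)

set.seed(1)
# Small sparse-ish matrix wrapped as a DelayedMatrix, standing in for a 10X assay.
dense <- matrix(as.numeric(rpois(50, lambda = 0.5)), nrow = 10)
delayed <- DelayedArray(dense)

# Hypothetical helper: coerce DelayedMatrix inputs, leave everything else alone.
to_dgc <- function(m) {
    if (is(m, "DelayedMatrix")) as(m, "dgCMatrix") else m
}

sparse <- to_dgc(delayed)
class(sparse)
```

For a list of SingleCellExperiment objects, the same helper could be applied to each logcounts assay before calling fastMNN. Note that the coercion loads the whole assay into memory, so it may not be practical for very large file-backed datasets.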

multiBatchNorm: Error in .rescale_size_factors(stats$averages, stats$size.factors, min.mean = min.mean) : median ratio of averages between batches is not finite

Hi, I get this error when I use multiBatchNorm to normalize and adjust for sequencing depth. I tried to work around it by changing the min.mean argument, but that did not help. It seems this problem has not been reported before: I searched Google and GitHub without finding anything.

When we encounter this error, does it mean that multiBatchNorm is not suitable for our data and that we should use something else, such as the cosineNorm function?

dimnames() differ

Hello,
I'm trying to use multiBatchNorm with two assays (B10 and B01).
I've checked multiple times for naming and number of rows to be equal but still get this error:

universe <- intersect(rownames(B10), rownames(B01))
rescaled <- multiBatchNorm(B10[universe,], B01[universe,])
Error in `assays<-`(`*tmp*`, ..., value = `*vtmp*`) :
current and replacement dimnames() differ

Please advise.

mnnCorrect.R

Hi, I am new to data analysis. As part of my project I am trying to batch correct three batches of single-cell data (count data obtained from alignment), and I tried using mnnCorrect as follows:

##Execute the below in R version 3.6.2
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("batchelor")

##I have counts from alignment of three batches
batch1<-read.csv("file1.csv", skip=1,row.names = 1,header=FALSE, sep='\t')
batch2<-read.csv("file2.csv", skip=1,row.names = 1, header=FALSE,sep='\t')
batch3<-read.csv("file3.csv", skip=1,row.names = 1, header=FALSE,sep='\t')

##and I modified the code below at only two places:
##1. mnnCorrect <- function(batch1,batch2,batch3, batch=NULL, restrict=3, k=20,..
##2. original <- batches <- .unpack_batches(batch1,batch2,batch3)

##----code modified as mentioned above and executed in R-----
mnnCorrect <- function(batch1,batch2,batch3, batch=NULL, restrict=3, k=20, prop.k=NULL, sigma=0.1,
    cos.norm.in=TRUE, cos.norm.out=TRUE, svd.dim=0L, var.adj=TRUE,
    subset.row=NULL, correct.all=FALSE, merge.order=NULL, auto.merge=FALSE,
    assay.type="logcounts", BSPARAM=ExactParam(), BNPARAM=KmknnParam(), BPPARAM=SerialParam())
{
    original <- batches <- .unpack_batches(batch1,batch2,batch3)
    checkBatchConsistency(batches)
    restrict <- checkRestrictions(batches, restrict)

    # Pulling out information from the SCE objects.
    is.sce <- checkIfSCE(batches)
    if (any(is.sce)) {
        batches[is.sce] <- lapply(batches[is.sce], assay, i=assay.type, withDimnames=FALSE)
    }

    # Subsetting by 'batch'.
    do.split <- length(batches)==1L
    if (do.split) {
        divided <- divideIntoBatches(batches[[1]], batch=batch, restrict=restrict[[1]])
        batches <- divided$batches
        restrict <- divided$restrict
    }

    # Setting up the parallelization environment.
    if (.bpNotSharedOrUp(BPPARAM)) {
        bpstart(BPPARAM)
        on.exit(bpstop(BPPARAM), add=TRUE)
    }

    output <- do.call(.mnn_correct, c(batches,
        list(k=k, prop.k=prop.k, sigma=sigma, cos.norm.in=cos.norm.in, cos.norm.out=cos.norm.out, svd.dim=svd.dim,
            var.adj=var.adj, subset.row=subset.row, correct.all=correct.all, restrict=restrict,
            merge.order=merge.order, auto.merge=auto.merge,
            BSPARAM=BSPARAM, BNPARAM=BNPARAM, BPPARAM=BPPARAM)))

    # Reordering the output for correctness.
    if (do.split) {
        d.reo <- divided$reorder
        output <- output[,d.reo,drop=FALSE]
        metadata(output)$merge.info$pairs <- .reindex_pairings(metadata(output)$merge.info$pairs, d.reo)
    }
    .rename_output(output, original, subset.row=subset.row)
}

-------------------------end of execution

There was no output from the above code, although everything executed and no error was thrown.
When I tried to access the 'output' object, I got: Error: object 'output' not found

Can you please help me understand where I am going wrong? or if my execution is wrong?

Please advise how I can obtain the batch-corrected counts. Thanks a lot in advance for your help.
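
A possible explanation, offered as a sketch rather than a diagnosis: pasting a function definition into R only creates the function. Nothing runs until the function is called, and variables such as `output` exist only inside that call. The toy function below (`toy_correct` is a made-up name and is not part of batchelor) illustrates this in base R.

```r
# Toy stand-in for a correction function; NOT batchelor code.
toy_correct <- function(...) {
    batches <- list(...)
    # 'output' is local to this call and disappears when the call returns.
    output <- do.call(cbind, batches)
    output
}

genes <- paste0("gene", 1:3)
b1 <- matrix(1:6, nrow = 3, dimnames = list(genes, c("c1", "c2")))
b2 <- matrix(7:12, nrow = 3, dimnames = list(genes, c("c3", "c4")))

# Defining toy_correct() above produced no result; calling it does.
corrected <- toy_correct(b1, b2)

# The returned value is available under the name it was assigned to.
dim(corrected)
```

With batchelor installed, the equivalent step would presumably be to call the packaged `mnnCorrect()` directly on log-transformed expression matrices with matching row names and assign the result to a variable, without copying the function source. Note that the result holds corrected log-expression values rather than counts.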

semi-supervised fastMNN correction

(First, thanks Aaron for the development and maintenance of this awesome package!)

After reading this preprint, I was wondering if there would be the possibility for such a semi-supervised correction with fastMNN()?

For example filtering MNN pairs could be done based on the prior annotation of different batches, based on the labels inferred from a SingleR run, based on the matching clusters after a clusterMNN() run... What do you think?

The gene names are NAs after running mnnCorrect

Hello, developers
When using mnnCorrect, I set subset.row to the 5000 HVGs and correct.all to TRUE for better determination of MNN pairs. However, the corrected SingleCellExperiment object has NAs in its row names (gene names). Looking more closely at the corrected matrix, I find that all gene names are NA except for the 5000 HVGs, although the corrected values exist.
I don't understand why the gene names are NA after setting subset.row to the 5000 HVGs and correct.all to TRUE. What should I do to fix it?

Best wishes

Below are my code and session info:
# set the sce data and logNorm
luad_a_sce <- SingleCellExperiment(assay = list("counts" = as.matrix(luad_a@assays$Spatial@counts))) %>%
    logNormCounts()

colnames(luad_a_sce) <- paste0(colnames(luad_a_sce), "_1")

luad_b_sce <- SingleCellExperiment(assay = list("counts" = as.matrix(luad_b@assays$Spatial@counts))) %>%
    logNormCounts()

colnames(luad_b_sce) <- paste0(colnames(luad_b_sce), "_2")

luad_c_sce <- SingleCellExperiment(assay = list("counts" = as.matrix(luad_c@assays$Spatial@counts))) %>%
    logNormCounts()

colnames(luad_c_sce) <- paste0(colnames(luad_c_sce), "_3")

# find the highly variable genes for downstream MNN
dec_luad_a_sce <- modelGeneVar(luad_a_sce)
dec_luad_b_sce <- modelGeneVar(luad_b_sce)
dec_luad_c_sce <- modelGeneVar(luad_c_sce)

combined_dec_abc <- combineVar(dec_luad_a_sce, dec_luad_b_sce, dec_luad_c_sce)
HVGs_abc_sce <- getTopHVGs(combined_dec_abc, n=5000)

luad_abc_batch_corrected_sce_MNN <- mnnCorrect(luad_a_sce, luad_b_sce, luad_c_sce,
    cos.norm.out = FALSE, subset.row = HVGs_abc_sce,
    correct.all = TRUE, merge.order = c(1,3,2))

R version 4.0.5 (2021-03-31)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /public/home/maintain/miniconda3/lib/libopenblasp-r0.3.20.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] forcats_0.5.1 stringr_1.4.1 dplyr_1.0.9 purrr_0.3.4
[5] readr_2.1.2 tidyr_1.2.0 tibble_3.1.7 tidyverse_1.3.2
[9] sp_1.5-0 SeuratObject_4.1.0 Seurat_4.1.1 scuttle_1.0.4
[13] scater_1.18.6 ggplot2_3.3.6 scran_1.18.7 batchelor_1.6.3
[17] SingleCellExperiment_1.12.0 SummarizedExperiment_1.20.0 Biobase_2.50.0 GenomicRanges_1.42.0
[21] GenomeInfoDb_1.26.7 IRanges_2.24.1 S4Vectors_0.28.1 BiocGenerics_0.43.4
[25] MatrixGenerics_1.2.1 matrixStats_0.62.0

loaded via a namespace (and not attached):
[1] utf8_1.2.2 reticulate_1.26 tidyselect_1.1.2 htmlwidgets_1.5.4
[5] grid_4.0.5 BiocParallel_1.24.1 Rtsne_0.16 munsell_0.5.0
[9] codetools_0.2-18 ica_1.0-2 statmod_1.4.36 future_1.28.0
[13] miniUI_0.1.1.1 withr_2.5.0 spatstat.random_2.2-0 colorspace_2.0-3
[17] progressr_0.10.1 rstudioapi_0.14 ROCR_1.0-11 tensor_1.5
[21] listenv_0.8.0 GenomeInfoDbData_1.2.4 polyclip_1.10-0 parallelly_1.32.1
[25] vctrs_0.4.1 generics_0.1.3 R6_2.5.1 ggbeeswarm_0.6.0
[29] rsvd_1.0.5 locfit_1.5-9.4 bitops_1.0-7 spatstat.utils_2.3-1
[33] DelayedArray_0.16.3 assertthat_0.2.1 promises_1.2.0.1 scales_1.2.1
[37] googlesheets4_1.0.1 rgeos_0.5-8 beeswarm_0.4.0 gtable_0.3.1
[41] beachmat_2.6.4 globals_0.16.1 goftest_1.2-3 rlang_1.0.2
[45] splines_4.0.5 lazyeval_0.2.2 gargle_1.2.1 spatstat.geom_2.4-0
[49] broom_0.8.0 reshape2_1.4.4 abind_1.4-5 modelr_0.1.9
[53] backports_1.4.1 httpuv_1.6.5 tools_4.0.5 ellipsis_0.3.2
[57] spatstat.core_2.4-4 RColorBrewer_1.1-3 ggridges_0.5.3 Rcpp_1.0.8.3
[61] plyr_1.8.7 sparseMatrixStats_1.2.1 zlibbioc_1.36.0 RCurl_1.98-1.6
[65] rpart_4.1.16 deldir_1.0-6 pbapply_1.5-0 viridis_0.6.2
[69] cowplot_1.1.1 zoo_1.8-10 haven_2.5.0 ggrepel_0.9.1
[73] cluster_2.1.3 fs_1.5.2 magrittr_2.0.3 data.table_1.14.2
[77] scattermore_0.8 ResidualMatrix_1.0.0 lmtest_0.9-40 reprex_2.0.2
[81] RANN_2.6.1 googledrive_2.0.0 fitdistrplus_1.1-8 hms_1.1.1
[85] patchwork_1.1.1 mime_0.12 xtable_1.8-4 readxl_1.4.0
[89] gridExtra_2.3 compiler_4.0.5 KernSmooth_2.23-20 crayon_1.5.2
[93] htmltools_0.5.2 mgcv_1.8-40 later_1.3.0 tzdb_0.3.0
[97] lubridate_1.8.0 DBI_1.1.3 dbplyr_2.2.1 MASS_7.3-57
[101] Matrix_1.4-1 cli_3.3.0 parallel_4.0.5 igraph_1.3.1
[105] pkgconfig_2.0.3 plotly_4.10.0 spatstat.sparse_2.1-1 xml2_1.3.3
[109] vipor_0.4.5 dqrng_0.3.0 XVector_0.30.0 rvest_1.0.3
[113] digest_0.6.29 sctransform_0.3.3 RcppAnnoy_0.0.19 spatstat.data_2.2-0
[117] cellranger_1.1.0 leiden_0.4.2 uwot_0.1.11 edgeR_3.32.1
[121] DelayedMatrixStats_1.12.3 shiny_1.7.3 lifecycle_1.0.1 nlme_3.1-157
[125] jsonlite_1.8.0 BiocNeighbors_1.8.2 viridisLite_0.4.1 limma_3.46.0
[129] fansi_1.0.3 pillar_1.8.1 lattice_0.20-45 fastmap_1.1.0
[133] httr_1.4.4 survival_3.3-1 glue_1.6.2 png_0.1-7
[137] bluster_1.0.0 stringi_1.7.6 BiocSingular_1.6.0 irlba_2.3.5
[141] future.apply_1.9.0

fastMNN not finishing

I am testing fastMNN() on our single-cell dataset to see how well it removes batch effects. I obtained matrices for the groups I wanted to test by extracting the cells of each batch from a Seurat object, making sure they were in the right sparse matrix format. I then ran fastMNN() overnight, setting it to use multiple cores.

However, it did not appear to work: it was still processing in the morning and did not seem to be using multiple cores as I had specified. I selected multiple cores in the same way as in a previous run of scran, so I don't think that is the cause. I have also run Seurat CCA overnight on a less powerful computer and obtained results, so I don't think processing power is the issue.

What could have gone wrong that it wasn't able to complete processing? Or is it actually that intensive of a function that it would require more than a day to finish running?

Error in RunFastMNN with Seurat wrapper

Hello,
I am using RunFastMNN, which can be used directly with a Seurat object.
It used to work without problems, but all of a sudden I get the error below.
Do you have any suggestions for solving this?
Thank you!

im <- RunFastMNN(object.list = SplitObject(a, split.by = "batch"))
Computing 2000 integration features
Error in SummarizedExperiment::SummarizedExperiment(assays = assays) :
the rownames and colnames of the supplied assay(s) must be NULL or
identical to those of the SummarizedExperiment object (or derivative)
to construct

questions about fastMNN and mnnCorrect functions

Dear all,
I have a question about the functions fastMNN and mnnCorrect.
I give mnnCorrect or fastMNN two gene expression matrices (genes in rows and cells in columns), but I always get
the same error: Error in checkBatchConsistency(batches) :
row names are not the same across batches
although I kept only the common genes in the two matrices.

Also, how can I tell from the output which cells from the two matrices match each other?

Thanks
Ahmed
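
One possible cause, stated as an assumption because the matrices are not shown: the two matrices contain the same genes but in a different order, or one of them has duplicated or missing row names. A base R sketch of aligning the rows before calling either function (`mat1` and `mat2` are placeholder names):

```r
# Placeholder matrices: genes in rows, cells in columns.
mat1 <- matrix(1:8, nrow = 4, dimnames = list(c("A", "B", "C", "D"), c("x1", "x2")))
mat2 <- matrix(1:6, nrow = 3, dimnames = list(c("D", "B", "A"), c("y1", "y2")))

# Keep the shared genes and put both matrices in the same row order.
common <- intersect(rownames(mat1), rownames(mat2))
mat1 <- mat1[common, , drop = FALSE]
mat2 <- mat2[common, , drop = FALSE]

# Both checks should pass before batch correction is attempted.
stopifnot(identical(rownames(mat1), rownames(mat2)))
stopifnot(!anyDuplicated(rownames(mat1)))
```

On the second question, the mnnCorrect source quoted in an earlier issue on this page refers to `metadata(output)$merge.info$pairs`, which suggests that the MNN pairings are stored in the metadata of the returned object; the exact layout is best checked in the function documentation.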

Correct more than two batches

Hello,
I am running fastMNN on a data set that has two batch variables, 'dataset' and 'condition'.

I tried to correct for both at once by specifying batch as
fastMNN(combined,batch=c(combined$dataset, combined$condition), subset.row=chosen.hvgs)
but this does not work.

Is there a way to correct multiple batches?

Thank you :)
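
A note on why the call above fails, with a hedged sketch of one alternative: `c()` concatenates the two annotation vectors into a single vector twice as long as the number of cells, so it cannot serve as a per-cell batch label. One option is to combine the two variables into a single factor with one level per observed combination. The vectors below are placeholders standing in for `combined$dataset` and `combined$condition`.

```r
# Placeholder per-cell annotations (one entry per cell).
dataset <- c("d1", "d1", "d2", "d2", "d2")
condition <- c("ctrl", "treat", "ctrl", "ctrl", "treat")

# c() yields one long vector, not one label per cell.
length(c(dataset, condition))

# One label per cell, one level per observed dataset/condition combination.
batch <- interaction(dataset, condition, drop = TRUE)
length(batch)
levels(batch)
```

The combined factor could then be supplied as `batch = batch` in the fastMNN call, assuming each combination really should be treated as its own batch.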

Error integrating with fastMNN when specifying merge order

Hi,

I'm trying to perform batch correction using fastMNN with a hierarchical merge order that integrates within each developmental stage before merging across stages. I've run the code below, but I'm running into a subscript out of bounds error.

sce <- loadWagner2018()

gene_var <- modelGeneVar(sce)
hvgs <- getTopHVGs(gene_var,n=5000)
meta <- colData(sce)

order_df = meta[!duplicated(sce$library_id), c("stage", "library_id")]
order_df$ncells = sapply(order_df$library_id, function(x) sum(meta$library_id == x))
order_df$stage = factor(order_df$stage, 
                        levels = rev(c("24hpf", 
                                       "18hpf", 
                                       "14hpf", 
                                       "10hpf", 
                                       "8hpf", 
                                       "6hpf", 
                                       "4hpf")))
order_df$library_id <- as.factor(order_df$library_id)
order_df = order_df[order(order_df$stage, order_df$ncells, decreasing = TRUE),]

merge_order <- lapply(split(order_df,order_df$stage), function(x) list(x$library_id))
names(merge_order)<- NULL

order_df$stage = as.character(order_df$stage)


out <- fastMNN(sce,batch=sce$library_id,subset.row = hvgs,
               merge.order = rev(merge_order))

Error message:
Error: subscript contains out-of-bounds indices

After a bit of digging, it looks like the error is thrown while trying to reorder the MNN output at line 387.

# Reordering by the input order.        
    d.reo <- divided$reorder
    output <- output[d.reo,,drop=FALSE]

I was wondering if you could suggest what I might be doing wrong...?

The dataset has 63,530 cells and so d.reo is a permutation of these, however, the dimensions of output seems to be (29106, 2) which I was a bit confused by.

The merge.order list specified contains a nested list of the library IDs:

[[1]]
[[1]][[1]]
 [1] DEW106 DEW108 DEW105 DEW103 DEW102 DEW107 DEW109 DEW164 DEW101 DEW166 DEW104 DEW163 DEW110
[14] DEW165 DEW055 DEW162 DEW057 DEW056 DEW053 DEW054 DEW052 DEW168 DEW169 DEW021 DEW159 DEW158
[27] DEW167 DEW160 DEW161
54 Levels: DEW001 DEW003 DEW010 DEW011 DEW012 DEW021 DEW032 DEW033 DEW034 DEW035 ... DEW169


[[2]]
[[2]][[1]]
[1] DEW003 DEW038 DEW039 DEW041 DEW040 DEW012 DEW001
54 Levels: DEW001 DEW003 DEW010 DEW011 DEW012 DEW021 DEW032 DEW033 DEW034 DEW035 ... DEW169


[[3]]
[[3]][[1]]
[1] DEW035 DEW036 DEW037 DEW011
54 Levels: DEW001 DEW003 DEW010 DEW011 DEW012 DEW021 DEW032 DEW033 DEW034 DEW035 ... DEW169


[[4]]
[[4]][[1]]
[1] DEW033 DEW034 DEW010 DEW032
54 Levels: DEW001 DEW003 DEW010 DEW011 DEW012 DEW021 DEW032 DEW033 DEW034 DEW035 ... DEW169


[[5]]
[[5]][[1]]
[1] DEW048 DEW047 DEW049 DEW046
54 Levels: DEW001 DEW003 DEW010 DEW011 DEW012 DEW021 DEW032 DEW033 DEW034 DEW035 ... DEW169


[[6]]
[[6]][[1]]
[1] DEW043 DEW044 DEW045 DEW042
54 Levels: DEW001 DEW003 DEW010 DEW011 DEW012 DEW021 DEW032 DEW033 DEW034 DEW035 ... DEW169


[[7]]
[[7]][[1]]
[1] DEW050 DEW051
54 Levels: DEW001 DEW003 DEW010 DEW011 DEW012 DEW021 DEW032 DEW033 DEW034 DEW035 ... DEW169

Any hints would be appreciated!

Best,

Dan


Session info:

R version 4.0.4 (2021-02-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] batchelor_1.6.2             SingleCellExperiment_1.12.0 SummarizedExperiment_1.20.0
 [4] Biobase_2.50.0              GenomicRanges_1.42.0        GenomeInfoDb_1.26.4        
 [7] BiocNeighbors_1.8.2         BiocParallel_1.24.1         DelayedArray_0.16.2        
[10] IRanges_2.24.1              S4Vectors_0.28.1            MatrixGenerics_1.2.1       
[13] matrixStats_0.58.0          BiocGenerics_0.36.0         Matrix_1.3-2               
[16] BiocSingular_1.6.0         

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6                BiocManager_1.30.10       compiler_4.0.4           
 [4] bluster_1.0.0             XVector_0.30.0            bitops_1.0-6             
 [7] tools_4.0.4               DelayedMatrixStats_1.12.3 zlibbioc_1.36.0          
[10] digest_0.6.27             statmod_1.4.35            evaluate_0.14            
[13] lattice_0.20-41           rlang_0.4.10              pkgconfig_2.0.3          
[16] igraph_1.2.6              yaml_2.2.1                xfun_0.21                
[19] GenomeInfoDbData_1.2.4    knitr_1.31                locfit_1.5-9.4           
[22] grid_4.0.4                scuttle_1.0.4             rmarkdown_2.7            
[25] limma_3.46.0              irlba_2.3.3               magrittr_2.0.1           
[28] edgeR_3.32.1              htmltools_0.5.1.1         sparseMatrixStats_1.2.1  
[31] beachmat_2.6.4            rsvd_1.0.3                dqrng_0.2.1              
[34] ResidualMatrix_1.0.0      RCurl_1.98-1.2            scran_1.18.5   
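
One thing that may be worth trying, offered as an untested guess: in the merge order above, each stage is a list of length one that wraps a factor carrying all 54 levels. The sketch below builds the nested list from character IDs instead, with each library as its own leaf inside its stage. It assumes that `merge.order` accepts a nested list whose leaves are single batch names, which should be checked against `?fastMNN`; the small `order_df` here is a placeholder for the real table.

```r
# Placeholder table standing in for order_df (one row per library).
order_df <- data.frame(
    stage = c("4hpf", "4hpf", "6hpf", "6hpf", "6hpf"),
    library_id = c("DEW050", "DEW051", "DEW042", "DEW043", "DEW044"),
    stringsAsFactors = FALSE
)

# Split the library IDs by stage, keeping them as character strings
# rather than factor codes.
by_stage <- split(order_df$library_id, order_df$stage)

# Turn each stage into a sub-list with one leaf per library,
# and drop the stage names from the outer list.
merge_order <- unname(lapply(by_stage, as.list))

str(merge_order)
```

A stage that contains a single library would end up as a list of length one under this scheme and might need to be supplied as a bare name instead.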

more singular values/vectors requested than available

Dear Developer,

I would like to ask about warnings that occur when I run fastMNN:

out <- fastMNN(sce, batch = sce$Batch,
               auto.merge = TRUE,
               subset.row = rowData(sce)$use_channel,
               assay.type = "exprs")

Warning in (function (A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth = TRUE, :
You're computing too large a percentage of total singular values, use a standard svd instead.
Warning message:
In check_numbers(k = k, nu = nu, nv = nv, limit = min(dim(x)) - :
more singular values/vectors requested than available

Can you help me figure that out?
Thanks in advance!
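
A hedged reading of these warnings: they tend to appear when the number of requested components is close to or above the smaller dimension of the matrix. fastMNN() requests `d = 50` components by default, so if `use_channel` selects only a few dozen features (an assumption, since the data are not shown), lowering `d` may help. A base R sketch of picking a value, with a placeholder vector standing in for `rowData(sce)$use_channel`:

```r
# Placeholder logical vector: 30 features selected out of 40.
use_channel <- rep(c(TRUE, FALSE), times = c(30, 10))

n_features <- sum(use_channel)

# Request fewer components than there are selected features;
# 50 is the default value of 'd' in fastMNN().
d <- min(50L, n_features - 1L)
d
```

The value could then be passed as `d = d` in the fastMNN call. Following the wording of the first warning, switching to an exact SVD via `BSPARAM = BiocSingular::ExactParam()` may be another option.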

Subsetting fastMNN integrated data

Hello,

I have integrated multiple datasets successfully using fastMNN. I now need to subset the data to focus on the analysis of specific clusters. My questions are as follows:

  1. Is it recommended to rerun the fastMNN integration on the subsetted data?
  2. Would it be appropriate to use the corrected PCA dimensions from the first round of integration, subsetted to the cells of interest, for dimensionality reduction?

In other discussion threads about integration methods such as Seurat's CCA, rerunning the integration on a subset of an integrated dataset is not recommended. However, Seurat performs the correction in gene expression space rather than in PCA space as fastMNN does, so it is unclear to me what the best approach would be.

Any advice would be greatly appreciated!

Installation error for R 4.0.0. Redhat 7.4

I get the following error. I appreciate your help:

  * installing *source* package ‘BiocNeighbors’ ...
    ** using staged installation
    ** libs
    g++ -std=gnu++11 -I"/gpfs/share/apps/R/4.0.0/lib64/R/include" -DNDEBUG -I'/gpfs/share/apps/R/4.0.0/lib64/R/library/Rcpp/include' -I'/gpfs/share/apps/R/4.0.0/lib64/R/library/RcppAnnoy/include' -I'/gpfs/share/apps/R/4.0.0/lib64/R/library/RcppHNSW/include' -I/usr/local/include -fpic -g -O2 -c RcppExports.cpp -o RcppExports.o
    g++ -std=gnu++11 -I"/gpfs/share/apps/R/4.0.0/lib64/R/include" -DNDEBUG -I'/gpfs/share/apps/R/4.0.0/lib64/R/library/Rcpp/include' -I'/gpfs/share/apps/R/4.0.0/lib64/R/library/RcppAnnoy/include' -I'/gpfs/share/apps/R/4.0.0/lib64/R/library/RcppHNSW/include' -I/usr/local/include -fpic -g -O2 -c annoy.cpp -o annoy.o
    In file included from annoy.cpp:1:0:
    annoy.h:33:63: error: wrong number of template arguments (4, should be 5)
    typedef AnnoyIndex<Index_t, Data_t, Distance, Kiss64Random> _index;
    ^
    In file included from annoy.h:17:0,
    from annoy.cpp:1:
    /gpfs/share/apps/R/4.0.0/lib64/R/library/RcppAnnoy/include/annoylib.h:845:9: note: provided for ‘template<class S, class T, class Distance, class Random, class ThreadedBuildPolicy> class AnnoyIndex’
    class AnnoyIndex : public AnnoyIndexInterface<S, T,
    ^~~~~~~~~~
    annoy.cpp: In constructor ‘Annoy::Annoy(int, const string&, double)’:
    annoy.cpp:7:9: error: request for member ‘load’ in ‘((Annoy*)this)->Annoy::obj’, which is of non-class type ‘Annoy::_index {aka int}’
    obj.load(fname.c_str());
    ^~~~
    annoy.cpp: In member function ‘MatDim_t Annoy::get_nobs() const’:
    annoy.cpp:16:16: error: request for member ‘get_n_items’ in ‘((const Annoy*)this)->Annoy::obj’, which is of non-class type ‘const _index {aka const int}’
    return obj.get_n_items();
    ^~~~~~~~~~~
    annoy.cpp: In member function ‘void Annoy::find_nearest_neighbors(CellIndex_t, NumNeighbors_t, bool, bool)’:
    annoy.cpp:44:9: error: request for member ‘get_nns_by_item’ in ‘((Annoy*)this)->Annoy::obj’, which is of non-class type ‘Annoy::_index {aka int}’
    obj.get_nns_by_item(c, K + 1, get_search_k(K + 1), &kept_idx, dptr); // +1, as it forgets to discard 'self'.
    ^~~~~~~~~~~~~~~
    annoy.cpp: In member function ‘void Annoy::find_nearest_neighbors(const double*, NumNeighbors_t, bool, bool)’:
    annoy.cpp:86:9: error: request for member ‘get_nns_by_vector’ in ‘((Annoy*)this)->Annoy::obj’, which is of non-class type ‘Annoy::_index {aka int}’
    obj.get_nns_by_vector(holding.data(), K, get_search_k(K), &kept_idx, dptr);
    ^~~~~~~~~~~~~~~~~
    make: *** [annoy.o] Error 1
    ERROR: compilation failed for package ‘BiocNeighbors’
  * removing ‘/gpfs/share/apps/R/4.0.0/lib64/R/library/BiocNeighbors’
    ERROR: dependency ‘BiocNeighbors’ is not available for package ‘scater’
  * removing ‘/gpfs/share/apps/R/4.0.0/lib64/R/library/scater’
    ERROR: dependencies ‘BiocNeighbors’, ‘scater’ are not available for package ‘batchelor’
  * removing ‘/gpfs/share/apps/R/4.0.0/lib64/R/library/batchelor’

Unable to use correctExperiments with 'subset.row=' defined

Hi,

There is a problem when using correctExperiments with the subset.row= argument to subset the assays to the specified genes. I am calling the function on a list of SingleCellExperiments, and the error comes from this part of the code:

if (!is.null(subset.row) && !correct.all) {
raw.ass <- lapply(raw.ass, "[", i=subset.row, , drop=FALSE)
}

Error in eval(subscript, envir = eframe, enclos = eframe): '...' used in an incorrect context
Traceback:

1. lapply(raw.ass, "[", i = subset.row, , drop = FALSE)
2. lapply(raw.ass, "[", i = subset.row, , drop = FALSE)
3. FUN(X[[i]], ...)
4. FUN(X[[i]], ...)
5. extract_Nindex_from_syscall(sys.call(), parent.frame())
6. lapply(seq_len(length(call) - 2L), function(i) {
 .     subscript <- call[[2L + i]]
 .     if (missing(subscript)) 
 .         return(NULL)
 .     subscript <- eval(subscript, envir = eframe, enclos = eframe)
 .     if (is.null(subscript)) 
 .         return(integer(0))
 .     subscript
 . })
7. lapply(seq_len(length(call) - 2L), function(i) {
 .     subscript <- call[[2L + i]]
 .     if (missing(subscript)) 
 .         return(NULL)
 .     subscript <- eval(subscript, envir = eframe, enclos = eframe)
 .     if (is.null(subscript)) 
 .         return(integer(0))
 .     subscript
 . })
8. FUN(X[[i]], ...)
9. eval(subscript, envir = eframe, enclos = eframe)
10. eval(subscript, envir = eframe, enclos = eframe)
11. eval(subscript, envir = eframe, enclos = eframe)

I propose changing it as below, which should work.

        if (!is.null(subset.row) && !correct.all) {
            raw.ass <- lapply(raw.ass, function(x) x[subset.row,,drop=FALSE])
        }
R version 4.1.0 (2021-05-18)
batchelor_1.8.0

pre-processing for MNN

Hi Aaron,

I would like to use MNN for batch correction and have a few questions regarding the data pre-processing for MNN.
At the moment my steps are 1. Filter out empty droplets/low quality cells (for each sample), 2. merge samples in the same batch, 3. normalization within a batch, 4. get HVG, normalize across batches and run MNN.
(My experimental design in detail is described here (#21).)

  1. In step 2, which would you recommend: normalising within a batch or normalising within a sample?

  2. At which point should I filter out lowly expressed genes (calculateAverage(sce) > 0.1)? I am setting min.mean=0.1 for computeSumFactors and using top 5000 highly variable genes for multiBatchNorm/MNN. Do I still have to worry about filtering out genes before within-batch normalization or batch correction?

Thank you very much for your help.
