saeyslab / multinichenetr

MultiNicheNet: a flexible framework for differential cell-cell communication analysis from multi-sample multi-condition single-cell transcriptomics data

License: GNU General Public License v3.0

cell-cell-communication differential-expression intercellular-communication intercellular-signaling ligand-receptor ligand-receptor-interaction ligand-target scrna-seq single-cell single-cell-rna-seq

multinichenetr's Introduction

multinichenetr

multinichenetr: the R package for differential cell-cell communication analysis from single-cell transcriptomics data with complex multi-sample, multi-condition designs. The goal of this toolbox is to study differences in intercellular communication between groups of samples of interest (e.g., patients in different disease states).

You can read all about MultiNicheNet in the following preprint: https://www.biorxiv.org/content/10.1101/2023.06.13.544751v1

The main goal of the MultiNicheNet package is to find which ligand-receptor interactions are differentially expressed and differentially active between conditions of interest, such as patient groups. Compared to the normal NicheNet workflow, MultiNicheNet is more suited to tackle complex experimental designs such as those with multiple samples and conditions, and multiple receiver cell types of interest.

In the MultiNicheNet approach, we allow the user to prioritize differential cell-cell communication events (ligand-receptor interactions and downstream signaling to target genes) based on the following criteria:

Upregulation of the ligand in a sender cell type and/or upregulation of the receptor in a receiver cell type - in the condition of interest.
Sufficiently high expression levels of ligand and receptor in many samples of the same group (to mitigate the influence of outlier samples).
Cell-type and condition specific expression of the ligand in the sender cell type and receptor in the receiver cell type (to mitigate the influence of upregulated but still relatively weakly expressed ligands/receptors)
High NicheNet ligand activity, to further prioritize ligand-receptor pairs based on their predicted effect of the ligand-receptor interaction on the gene expression in the receiver cell type

MultiNicheNet combines all these criteria in a single prioritization score, which is also comparable between all sender-receiver pairs. This way, MultiNicheNet extends on the prioritization done by NicheNet, which is only based on the ligand activity score.

Users can customize the weights of these different factors to prioritize some of these criteria stronger, or neglect them altogether.
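
As a minimal sketch of what such a weighting could look like: the element names below follow those used in the package vignettes (treat them as an assumption for your installed version), while `weighted_score()` and the toy scores are hypothetical and only illustrate the idea of a weighted average. This is not the package's actual scoring code.

```r
# Weights per criterion; element names assumed from the package vignettes.
prioritizing_weights <- c(
  de_ligand = 1, de_receptor = 1,          # differential expression
  activity_scaled = 2,                     # NicheNet ligand activity
  exprs_ligand = 2, exprs_receptor = 2,    # cell-type/condition specificity
  frac_exprs_ligand_receptor = 1,          # sufficiently high expression
  abund_sender = 0, abund_receiver = 0     # weight 0 = criterion is ignored
)

# Hypothetical helper: weighted average of criteria already scaled to [0, 1].
weighted_score <- function(scores, weights) {
  shared <- intersect(names(scores), names(weights))
  sum(scores[shared] * weights[shared]) / sum(weights[shared])
}

# Toy scores for one ligand-receptor pair (made-up numbers).
toy_scores <- c(de_ligand = 0.9, de_receptor = 0.5, activity_scaled = 0.8,
                exprs_ligand = 0.7, exprs_receptor = 0.6,
                frac_exprs_ligand_receptor = 1, abund_sender = 0.2,
                abund_receiver = 0.1)
weighted_score(toy_scores, prioritizing_weights)
```

Setting a weight to 0, as done here for the abundance criteria, removes that criterion from the combined score.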

At the basis of MultiNicheNet for defining differentially expressed ligands, receptors and target genes is the differential state analysis as discussed by muscat, which provides a framework for cell-level mixed models or methods based on aggregated “pseudobulk” data (https://doi.org/10.1038/s41467-020-19894-4, https://bioconductor.org/packages/release/bioc/html/muscat.html). We use this muscat framework to make inferences at the sample level (as desired in a multi-sample, multi-condition setting) rather than the classic cell-level differential expression analysis of Seurat (Seurat::FindMarkers), because muscat allows us to overcome some of the limitations of cell-level analyses for differential state analyses. These limitations include a bias towards samples with more cells of a given cell type, a lack of flexibility to work with complex study designs, and an overly optimistic estimation of statistical power, since the analysis is done at the cell level and not at the sample level.
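
To make the idea of “pseudobulk” aggregation concrete, here is a small base R sketch on made-up data. It only illustrates summing counts per sample and cell type; it is not how muscat or multinichenetr implement the aggregation, and all object names are hypothetical.

```r
set.seed(1)
# Toy count matrix: 5 genes x 12 cells (made-up data).
counts <- matrix(rpois(60, lambda = 5), nrow = 5,
                 dimnames = list(paste0("gene", 1:5), paste0("cell", 1:12)))
cell_meta <- data.frame(
  sample   = rep(c("s1", "s2", "s3"), each = 4),
  celltype = rep(c("Tcell", "Bcell"), times = 6)
)

# Sum counts over all cells that share a sample and a cell type.
pb_id <- paste(cell_meta$sample, cell_meta$celltype, sep = "_")
pseudobulk <- t(rowsum(t(counts), group = pb_id))

dim(pseudobulk)   # genes x (sample, cell type) combinations
```

After aggregation, each sample contributes one profile per cell type, so a sample with many cells no longer dominates the differential expression test.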

In the future, we might extend the differential expression analyses options to include other frameworks than muscat.

Main functionalities of multinichenetr

Prioritizing the most important ligand-receptor interactions from different sender-receiver pairs between different sample groups, according to criteria such as condition specificity, cell-type specificity, ligand activity (= downstream signaling activity), and more.
Finding differentially expressed ligand-receptor interactions from different sender-receiver pairs between different sample groups (Differential Ligand-Receptor network inference).
Predicting the most active ligand-receptor interactions in different sample groups based on predicted signaling effects (NicheNet ligand activity analysis).
Predicting specific downstream affected target genes of ligand-receptor links of interest (NicheNet ligand-target inference).
Predicting intercellular signaling networks, connecting ligands to ligand- or receptor-encoding target genes in other cell types, enabling predictions concerning intercellular cascade and feedback mechanisms.

Installation of multinichenetr

Installation typically takes a few minutes, depending on the number of dependencies that have already been installed on your PC.

You can install multinichenetr (and required dependencies) from github with:

# install.packages("devtools")
devtools::install_github("saeyslab/nichenetr")
devtools::install_github("saeyslab/multinichenetr")

multinichenetr is tested via GitHub Actions on Windows, Linux (Ubuntu) and Mac (most recently tested R version: R 4.3.0).

Learning to use multinichenetr

We provide several vignettes demonstrating the different types of analysis that can be performed with MultiNicheNet, and the several types of downstream visualizations that can be created.

We recommend that users start with the following vignette, which demonstrates the different steps of the analysis without going into too much detail. This is the recommended vignette to learn the basics of MultiNicheNet.

MultiNicheNet analysis: MIS-C threewise comparison - step-by-step: vignette("basic_analysis_steps_MISC", package="multinichenetr")

This vignette provides an example of a comparison between 3 groups. The following vignettes demonstrate how to analyze cell-cell communication differences in other settings. For the sake of simplicity, these vignettes use a MultiNicheNet wrapper function, which encompasses the different steps demonstrated in the previous vignette. They are the best vignettes to learn how to apply MultiNicheNet to different datasets for addressing different questions.

MultiNicheNet analysis: MIS-C pairwise comparison - wrapper function: vignette("pairwise_analysis_MISC", package="multinichenetr")
MultiNicheNet analysis: MIS-C threewise comparison - wrapper function: vignette("threewise_analysis_MISC", package="multinichenetr")
MultiNicheNet analysis: SCC paired analysis - wrapper function: vignette("paired_analysis_SCC", package="multinichenetr")
MultiNicheNet analysis: anti-PD1 Breast cancer multifactorial comparison - wrapper function: vignette("multifactorial_analysis_BreastCancer", package="multinichenetr")
MultiNicheNet analysis: Integrated lung atlas analysis - correct for batch effects to infer differences between IPF and healthy subjects - wrapper function: vignette("batch_correction_analysis_LungAtlas", package="multinichenetr")

The next vignette covers the different steps in more detail, showcasing some additional recommended quality checks and visualizations:

MultiNicheNet analysis: MIS-C threewise comparison - step-by-step with all details: vignette("detailed_analysis_steps_MISC", package="multinichenetr")

That vignette also checks the p-value distributions of the DE analysis. If these are suboptimal, pointing to violations of some model assumptions, we recommend using empirical p-values as discussed in the Methods section of the paper and demonstrated in the following vignette:

MultiNicheNet analysis: anti-PD1 Breast cancer multifactorial comparison - step-by-step with all details: vignette("detailed_analysis_steps_empirical_pvalues", package="multinichenetr")

When applying MultiNicheNet to datasets with many samples and cell types, it is recommended to run the analysis on HPC infrastructure. You can have a look at the following scripts to see how we split up the analysis into two parts: 1) running MultiNicheNet and saving the necessary output and plots; and 2) interpreting the results and generating visualizations.

Frequently recurring questions and issues

Even though it is stated in the vignettes, many reported issues arise because names of celltypes, groups/conditions, and/or samples are not syntactically valid. Before reporting your issue, make sure you satisfy this condition and other conditions described in the vignettes. In the latest version of MultiNicheNet, input checks are run to check this and give an understandable error message.
It is required that each sample is uniquely assigned to only one condition/group of interest. See the vignettes about paired and multifactorial analysis to see how to define your analysis input when you have multiple samples and conditions per patient. In the latest version of MultiNicheNet, input checks are run to check this and give an understandable error message.
We strongly recommend having at least 4 samples in each of the groups/conditions you want to compare. With fewer samples, the benefits of performing a pseudobulk-based DE analysis are less clear and non-multi-sample tools for differential cell-cell communication might be better alternatives.
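
The three points above can be checked before starting an analysis. The following is a minimal base R sketch on made-up metadata; the data frame `meta`, its column names, and the helper `is_valid()` are hypothetical and do not reproduce the package's own input checks.

```r
# Hypothetical cell-level metadata (made-up example).
meta <- data.frame(
  sample   = rep(paste0("P", 1:8), each = 2),
  group    = rep(c("Healthy", "Disease"), each = 8),
  celltype = rep(c("T cells", "B cells"), times = 8)
)

# 1. Names are syntactically valid if make.names() leaves them unchanged.
is_valid <- function(x) all(unique(x) == make.names(unique(x)))
sapply(meta, is_valid)

# One possible fix: convert the offending column.
meta$celltype <- make.names(meta$celltype)

# 2. Each sample should map to exactly one group.
groups_per_sample <- tapply(meta$group, meta$sample,
                            function(g) length(unique(g)))
all(groups_per_sample == 1)

# 3. Number of samples per group (at least 4 is recommended above).
samples_per_group <- table(unique(meta[, c("sample", "group")])$group)
samples_per_group
```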

References

Browaeys, R. et al. MultiNicheNet: a flexible framework for differential cell-cell communication analysis from multi-sample multi-condition single-cell transcriptomics data. (preprint)

Crowell, H.L., Soneson, C., Germain, PL. et al. muscat detects subpopulation-specific state transitions from multi-sample multi-condition single-cell transcriptomics data. Nat Commun 11, 6077 (2020). https://doi.org/10.1038/s41467-020-19894-4

Browaeys, R., Saelens, W. & Saeys, Y. NicheNet: modeling intercellular communication by linking ligands to target genes. Nat Methods (2019) doi:10.1038/s41592-019-0667-5

multinichenetr's Issues

Supporting evidence for the regulatory potential of a ligand on a target

Dear multinichenetr team,
Thanks for developing such a wonderful tool! I have run multinichenetr and identified a very interesting ligand for the downregulation of my target gene.
I tried to find evidence supporting the downregulation of my target by this ligand using Google search and PubMed, but could not find any.
The ligand of interest is CTHRC1, and running multinichenetr on my dataset shows that it could downregulate CLDN7. I am eager to find evidence that underpins this regulation.
Could you first elaborate on how the regulatory potential of a ligand on target genes is predicted? Could you also suggest ways to find relevant references or datasets so that I can gain confidence in this regulation?

Thanks very much, and looking forward to your reply.
Best,
Zhenzhen

get_abundance_expression_info changes category names that causes downstream error

head(sce$cellType)
#'CD4 Activated Memory T Cells' 'CD8 Cytotoxic T Cells' 'CD4 Memory T Cells' 'CD8 Cytotoxic T Cells' 'NK Cells' 'MAIT Cytotoxic T Cells'

senders_oi = c("Plasma Cells","Macrophages","CD1C DCs")
receivers_oi = c("Plasma Cells","CD4 Cytotoxic T Cells")

abundance_expression_info = get_abundance_expression_info(sce = sce_cntr, sample_id = sample_id, group_id = group_id,
  celltype_id = celltype_id, min_cells = min_cells, senders_oi = senders_oi,
  receivers_oi = receivers_oi, lr_network = lr_network, batches = batches)

I had been getting errors in the downstream analysis so I started debugging the steps. I went deep into the get_abundance_expression_info function and realized that the information gathered in the celltype_info object is different than the input data. The spaces were filled with dots.

celltype_info$rel_abundance_df
#A tibble: 8 × 3
#group celltype rel_abundance_scaled
#
#MM CD1C.DCs 0.00100000
#SMM CD1C.DCs 1.00100000
#MM CD4.Cytotoxic.T.Cells 0.67023857
#SMM CD4.Cytotoxic.T.Cells 0.33176143
.
.
.

This difference resulted in empty data frames in the code below.

rel_abundance_df_sender = sender_info$rel_abundance_df %>% dplyr::filter(sender %in% senders_oi)
rel_abundance_df_receiver = receiver_info$rel_abundance_df %>% dplyr::filter(receiver %in% receivers_oi)
rel_abundance_df_sender_receiver = rel_abundance_df_sender %>% dplyr::inner_join(rel_abundance_df_receiver, by =
"group") %>% dplyr::mutate(sender_receiver_rel_abundance_avg = 0.5*(rel_abundance_scaled_sender +
rel_abundance_scaled_receiver))

This in turn results in an empty data frame for the code below, causing errors in the following steps of the analysis.
abundance_expression_info$sender_receiver_info$rel_abundance_df
#A tibble: 0 × 6
#group sender rel_abundance_scaled_sender receiver rel_abundance_scaled_receiver #sender_receiver_rel_abundance_avg
#
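
The dotted names shown above are what base R's make.names() returns for names containing spaces. Whether the package calls make.names() internally is an assumption here, but the sketch below shows how the mismatch arises and one way to keep the names of interest in sync.

```r
celltypes <- c("CD1C DCs", "CD4 Cytotoxic T Cells", "Plasma Cells")
make.names(celltypes)   # spaces are replaced by dots

# Applying the same conversion to the names of interest keeps them in sync.
senders_oi   <- make.names(c("Plasma Cells", "Macrophages", "CD1C DCs"))
receivers_oi <- make.names(c("Plasma Cells", "CD4 Cytotoxic T Cells"))
receivers_oi %in% make.names(celltypes)
```

Converting the cell type labels in the object itself before the analysis, as the README recommends, avoids having to remember this conversion later.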

Fig 2b in preprint

Hi, thanks for the nice tool.

I found the layout of Fig 2b to be very clear and readable. In comparison, when using the make_ggraph_ligand_target_links() function to generate the plot, the output doesn't quite match the clarity of the preprint. I noticed that Fig 2b was created using Cytoscape, and I was wondering if you could kindly share the code needed to prepare the files for Cytoscape and any details regarding the layout used in Cytoscape.

Thank you so much for considering this request.

Jason

NAs in "target" and "ligand_target_weighted" columns `ligand_activities_targets_DEgenes$ligand_activities` dataframe

Thank you!

I wanted to make sure this was OK, since at Step 3 of the introductory vignette (https://github.com/saeyslab/multinichenetr/blob/main/vignettes/basic_analysis_steps_MISC.md) I am getting a ligand_activities_targets_DEgenes$ligand_activities dataframe with rows that have NA values in the "target" and "ligand_target_weighted" columns. This happens especially if I use the adjusted p-value instead of the normal p-value, in which case it affects nearly half of the rows. Is this because of gene names missing from the ligand_target_matrix? I am using the mouse NicheNet v2 networks.

Thank you again.

Originally posted by @SergioRodLla in #7 (comment)

Error with cell type names containing spaces

I ran an analysis set up to view "T cells" as a receiver against all other cells as senders in my single nuclei RNA-seq data, and when I try to view prioritization tables of top interactions for T cells as senders it generates the following message:

prioritized_tbl_oi_PostT_50 = get_top_n_lr_pairs(multinichenet_output$prioritization_tables, 50, groups_oi = group_oi, senders_oi = "T cell")

plot_oi = make_sample_lr_prod_activity_plots(multinichenet_output$prioritization_tables, prioritized_tbl_oi_PostT_50)
Warning messages:
1: In max(.) : no non-missing arguments to max; returning -Inf
2: In max(., na.rm = TRUE) :
no non-missing arguments to max; returning -Inf
3: In max(.) : no non-missing arguments to max; returning -Inf
4: In min(.) : no non-missing arguments to min; returning Inf

After renaming all of my annotated cells such as "T cells", "B cells", "Dendritic cells", etc. to nomenclature without spaces ("Tcell", "DC", etc.) I had no issues viewing the prioritization tables or other data points.

Unsure if this is a bug or not, but after some trial and error this fixed my issues.

Error in `auto_copy()`: Set `copy = TRUE` if `y` can be copied to the same source as `x` (may be slow).

Hi,

when running MNN in step-by-step mode, I get the following error when calling get_DE_info:

Error in `auto_copy()`:
! `x` and `y` must share the same src.
ℹ `x` is a <tbl_df/tbl/data.frame> object.
ℹ `y` is `NULL`.
ℹ Set `copy = TRUE` if `y` can be copied to the same source as `x` (may be slow).
Run `rlang::last_trace()` to see where the error occurred.
> rlang::last_trace()
<error/rlang_error>
Error in `auto_copy()`:
! `x` and `y` must share the same src.
ℹ `x` is a <tbl_df/tbl/data.frame> object.
ℹ `y` is `NULL`.
ℹ Set `copy = TRUE` if `y` can be copied to the same source as `x` (may be slow).
---
Backtrace:
    ▆
 1. ├─multinichenetr::get_DE_info(...)
 2. │ └─... %>% ...
 3. ├─dplyr::select(., gene, cluster_id, logFC, p_val, p_adj, contrast)
 4. ├─dplyr::inner_join(., contrast_tbl, by = "group")
 5. └─dplyr:::inner_join.data.frame(., contrast_tbl, by = "group")
 6.   └─dplyr::auto_copy(x, y, copy = copy)
Run rlang::last_trace(drop = FALSE) to see 1 hidden frame.
> rlang::last_trace(drop = FALSE)
<error/rlang_error>

It appears to be the last part of the following which is causing the issue:

celltype_de_findmarkers = celltypes %>% lapply(function(celltype_oi, sce){
      genes_expressed = rownames(sce) ## change later if necessary for having a more decent filtering
      sce_oi = sce[intersect(rownames(sce), genes_expressed), SummarizedExperiment::colData(sce)[,celltype_id] == celltype_oi]
      DE_tables_list = scran::findMarkers(sce_oi, test.type="t", groups = SummarizedExperiment::colData(sce_oi)[,group_id])
      conditions = names(DE_tables_list)
      DE_tables_df = conditions %>% lapply(function(condition_oi, DE_tables_list){
        DE_table_oi = DE_tables_list[[condition_oi]]
        DE_table_oi = DE_table_oi %>% data.frame() %>% tibble::rownames_to_column("gene") %>% tibble::as_tibble() %>% dplyr::mutate(cluster_id = celltype_oi, group = condition_oi) %>% dplyr::select(gene, p.value, FDR, summary.logFC, cluster_id, group)  
      }, DE_tables_list) %>% dplyr::bind_rows()
    }, sce) %>% dplyr::bind_rows() %>% dplyr::rename(logFC = summary.logFC, p_val = p.value, p_adj = FDR) %>% dplyr::inner_join(contrast_tbl, by = "group") %>% dplyr::select(gene, cluster_id, logFC, p_val, p_adj, contrast)

This is what contrast_tbl looks like (I believe it is set-up fine):

# A tibble: 3 × 2
  contrast                                                   group               
  <chr>                                                      <chr>               
1 Parotid.Gland-(Minor.Salivary.Gland+Submandibular.Gland)/2 Parotid.Gland       
2 Minor.Salivary.Gland-(Parotid.Gland+Submandibular.Gland)/2 Minor.Salivary.Gland
3 Submandibular.Gland-(Minor.Salivary.Gland+Parotid.Gland)/2 Submandibular.Gland

Have you any ideas what the problem might be?

Thanks,
Catherine

Error in dplyr::mutate() during "Combine all the information in prioritization tables"

Hi multinichenet team!

I am facing the following error when trying to run multi_nichenet_analysis() on my dataset.

The output I get:
[1] "Calculate differential expression for all cell types"
[1] "DE analysis is done:"
[1] "included cell types are:"
[1] "IEL" "Early.enterocytes" "Intermediate.enterocytes"
[4] "TA.cells" "Enteroendocrine.cells" "BEST4.cells"
[7] "Mature.enterocytes" "Tuft.cells" "Goblet.NEUROG3..progenitor"
[10] "Stem.cells" "Paneth.cells"
[1] "Make diagnostic abundance plots + Calculate expression information"
[1] "Calculate NicheNet ligand activities and ligand-target links"
[1] "Combine all the information in prioritization tables"
Error in dplyr::mutate():
ℹ In argument: scaled_LR_prod = nichenetr::scaling_zscore(ligand_receptor_prod).
Caused by error in if (sd(x, na.rm = TRUE) > 0) ...:
! missing value where TRUE/FALSE needed

`Backtrace:
▆

├─multinichenetr::multi_nichenet_analysis(...)
│ └─multinichenetr::multi_nichenet_analysis_combined(...)
│ ├─base::suppressMessages(...)
│ │ └─base::withCallingHandlers(...)
│ └─multinichenetr::generate_prioritization_tables(...)
│ └─... %>% dplyr::ungroup()
├─dplyr::ungroup(.)
├─dplyr::mutate(...)
├─dplyr:::mutate.data.frame(...)
│ └─dplyr:::mutate_cols(.data, dplyr_quosures(...), by)
│ ├─base::withCallingHandlers(...)
│ └─dplyr:::mutate_col(dots[[i]], data, mask, new_columns)
│ └─mask$eval_all_mutate(quo)
│ └─dplyr (local) eval()
└─nichenetr::scaling_zscore(ligand_receptor_prod)`

Any ideas on what could be the cause of this issue and how to solve it?
Thank you!
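
The message "missing value where TRUE/FALSE needed" means that `sd(x, na.rm = TRUE)` returned NA inside the `if ()` condition quoted in the error. In base R that happens when fewer than two non-missing values are passed. Whether this is what occurs in this particular dataset is an assumption; the sketch below only reproduces the mechanism, and `safe_zscore()` is a hypothetical defensive variant, not the nichenetr implementation.

```r
# sd() of a single value is NA, so the comparison below cannot be evaluated.
x <- 0.42
sd(x, na.rm = TRUE)

result <- tryCatch(
  if (sd(x, na.rm = TRUE) > 0) "scaled" else "constant",
  error = function(e) conditionMessage(e)
)
result

# Hypothetical defensive variant (not the nichenetr implementation).
safe_zscore <- function(x) {
  s <- sd(x, na.rm = TRUE)
  if (is.na(s) || s == 0) return(rep(0, length(x)))
  (x - mean(x, na.rm = TRUE)) / s
}
safe_zscore(x)
```

Checking how many ligand-receptor pairs remain per group before the scaling step may help locate the group that has only one value or only missing values.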

Error when creating contrasts_oi_simplified when running multi_nichenet_analysis_combined

Hello,

I am following this tutorial:

https://github.com/saeyslab/multinichenetr/blob/main/vignettes/threewise_analysis_MISC.md

But I get an error when I run the following function in step 2:

(Note: I have not made any changes to this)

multinichenet_output = multi_nichenet_analysis(sce = sce, celltype_id = celltype_id, sample_id = sample_id, group_id = group_id, lr_network = lr_network, ligand_target_matrix = ligand_target_matrix, contrasts_oi = contrasts_oi, contrast_tbl = contrast_tbl, batches = batches, covariates = covariates, prioritizing_weights = prioritizing_weights, min_cells = min_cells, logFC_threshold = logFC_threshold, p_val_threshold = p_val_threshold, fraction_cutoff = fraction_cutoff, p_val_adj = p_val_adj, empirical_pval = empirical_pval, top_n_target = top_n_target, n.cores = n.cores, sender_receiver_separate = FALSE, verbose = TRUE)

I got the following error:

Error in multi_nichenet_analysis_combined(...) :
conditions written in contrasts_oi should be in the contrast column of contrast_tbl column! This is not the case, which can lead to errors downstream.

I looked in the code for multi_nichenet_analysis_combined, and the error occurs when creating the contrasts_oi_simplified

contrasts_oi_simplified = stringr::str_split(contrasts_oi, "'") %>% unlist() %>% unique() %>% stringr::str_split(",") %>% unlist() %>% unique() %>% generics::setdiff(c("", ",")) %>% unlist() %>% unique()

This creates a vector that looks like this:
"A-B" " " "C-B", the " " is not removed.

I added this to the generics::setdiff(c("", ",")) so it becomes generics::setdiff(c("", ",", " "))

The full code:
contrasts_oi_simplified = stringr::str_split(contrasts_oi, "'") %>% unlist() %>% unique() %>% stringr::str_split(",") %>% unlist() %>% unique() %>% generics::setdiff(c("", ",", " ")) %>% unlist() %>% unique()

and this creates a vector that looks like this:
"A-B" "C-B"

And I am now able to run the command.

This is how I created the contrasts_oi and contrast_tbl

contrasts_oi = c("'A-B', 'C-B'")

contrast_tbl = tibble(contrast = c('A-B', 'C-B'),
                      group = c("A", "C"))

Is there something I am doing wrong when creating contrasts_oi and contrast_tbl?

Kind Regards,
Isabell
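
The behaviour described in this report can be reproduced in base R. The function below is a re-creation of the quoted splitting logic using strsplit() instead of stringr (so it is illustrative only, and `simplify_contrasts()` is a hypothetical name). It suggests that the space after the comma in contrasts_oi is what produces the stray " " element; the vignettes, as far as they show, write the contrasts without spaces after the commas.

```r
# Base R re-creation of the splitting logic quoted above (illustrative only).
simplify_contrasts <- function(contrasts_oi) {
  parts <- unlist(strsplit(contrasts_oi, "'", fixed = TRUE))
  parts <- unlist(strsplit(unique(parts), ",", fixed = TRUE))
  setdiff(unique(parts), c("", ","))
}

simplify_contrasts("'A-B', 'C-B'")   # space after the comma
simplify_contrasts("'A-B','C-B'")    # no space after the comma
```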

Session info:

`R version 4.3.0 (2023-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.1 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

time zone: Europe/Berlin
tzcode source: system (glibc)

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 Biobase_2.62.0 GenomicRanges_1.54.1 GenomeInfoDb_1.38.1
[6] IRanges_2.36.0 S4Vectors_0.40.2 BiocGenerics_0.48.1 MatrixGenerics_1.14.0 matrixStats_1.2.0
[11] ggplot2_3.4.4 dplyr_1.1.4 Signac_1.12.0 SeuratObject_5.0.1 Seurat_4.4.0
[16] multinichenetr_1.0.3

loaded via a namespace (and not attached):
[1] progress_1.2.3 nnet_7.3-18 locfdr_1.1-8 goftest_1.2-3 Biostrings_2.70.1
[6] vctrs_0.6.5 spatstat.random_3.2-2 digest_0.6.33 png_0.1-8 corpcor_1.6.10
[11] shape_1.4.6 proxy_0.4-27 ggrepel_0.9.4 deldir_2.0-2 parallelly_1.36.0
[16] MASS_7.3-58.4 reshape2_1.4.4 httpuv_1.6.13 foreach_1.5.2 withr_2.5.2
[21] xfun_0.41 ggpubr_0.6.0 ellipsis_0.3.2 survival_3.5-5 memoise_2.0.1
[26] ggbeeswarm_0.7.2 emmeans_1.8.9 zoo_1.8-12 GlobalOptions_0.1.2 gtools_3.9.5
[31] pbapply_1.7-2 Formula_1.2-5 prettyunits_1.2.0 KEGGREST_1.42.0 promises_1.2.1
[36] httr_1.4.7 rstatix_0.7.2 globals_0.16.2 fitdistrplus_1.1-11 rstudioapi_0.14
[41] miniUI_0.1.1.1 generics_0.1.3 base64enc_0.1-3 zlibbioc_1.48.0 ScaledMatrix_1.10.0
[46] ggraph_2.1.0 polyclip_1.10-6 randomForest_4.7-1.1 GenomeInfoDbData_1.2.11 SparseArray_1.2.2
[51] xtable_1.8-4 stringr_1.5.1 doParallel_1.0.17 evaluate_0.23 S4Arrays_1.2.0
[56] hms_1.1.3 irlba_2.3.5.1 colorspace_2.1-0 visNetwork_2.1.2 ROCR_1.0-11
[61] reticulate_1.34.0 spatstat.data_3.0-3 magrittr_2.0.3 lmtest_0.9-40 readr_2.1.4
[66] later_1.3.2 viridis_0.6.4 lattice_0.21-8 genefilter_1.84.0 spatstat.geom_3.2-7
[71] future.apply_1.11.0 XML_3.99-0.16 scattermore_1.2 scuttle_1.12.0 shadowtext_0.1.2
[76] cowplot_1.1.1 RcppAnnoy_0.0.21 class_7.3-21 Hmisc_5.1-1 pillar_1.9.0
[81] nlme_3.1-162 iterators_1.0.14 caTools_1.18.2 compiler_4.3.0 beachmat_2.18.0
[86] stringi_1.8.3 gower_1.0.1 tensor_1.5 minqa_1.2.6 lubridate_1.9.3
[91] plyr_1.8.9 crayon_1.5.2 abind_1.4-5 scater_1.30.1 blme_1.0-5
[96] locfit_1.5-9.8 sp_2.1-2 bit_4.0.5 graphlayouts_1.0.2 UpSetR_1.4.0
[101] fastmatch_1.1-4 codetools_0.2-19 recipes_1.0.8 BiocSingular_1.18.0 e1071_1.7-14
[106] GetoptLong_1.0.5 plotly_4.10.3 remaCor_0.0.16 mime_0.12 splines_4.3.0
[111] circlize_0.4.15 Rcpp_1.0.11 sparseMatrixStats_1.14.0 blob_1.2.4 knitr_1.45
[116] utf8_1.2.4 here_1.0.1 clue_0.3-65 lme4_1.1-35.1 listenv_0.9.0
[121] checkmate_2.3.1 DelayedMatrixStats_1.24.0 Rdpack_2.6 ggsignif_0.6.4 estimability_1.4.1
[126] tibble_3.2.1 Matrix_1.6-4 statmod_1.5.0 tzdb_0.4.0 fANCOVA_0.6-1
[131] tweenr_2.0.2 pkgconfig_2.0.3 tools_4.3.0 cachem_1.0.8 RSQLite_2.3.4
[136] RhpcBLASctl_0.23-42 rbibutils_2.2.16 DBI_1.1.3 viridisLite_0.4.2 numDeriv_2016.8-1.1
[141] fastmap_1.1.1 rmarkdown_2.25 scales_1.3.0 grid_4.3.0 ica_1.0-3
[146] Rsamtools_2.16.0 nichenetr_2.0.4 broom_1.0.5 patchwork_1.1.3 coda_0.19-4
[151] dotCall64_1.1-1 carData_3.0-5 RANN_2.6.1 rpart_4.1.19 farver_2.1.1
[156] aod_1.3.2 tidygraph_1.2.3 mgcv_1.8-42 DiagrammeR_1.0.10 foreign_0.8-84
[161] cli_3.6.2 purrr_1.0.2 leiden_0.4.3.1 lifecycle_1.0.4 caret_6.0-94
[166] uwot_0.1.16 glmmTMB_1.1.8 mvtnorm_1.2-4 bluster_1.12.0 lava_1.7.3
[171] backports_1.4.1 annotate_1.80.0 BiocParallel_1.36.0 timechange_0.2.0 gtable_0.3.4
[176] rjson_0.2.21 ggridges_0.5.4 progressr_0.14.0 parallel_4.3.0 pROC_1.18.5
[181] limma_3.58.1 jsonlite_1.8.8 edgeR_4.0.3 bitops_1.0-7 bit64_4.0.5
[186] Rtsne_0.17 spatstat.utils_3.0-4 BiocNeighbors_1.20.0 muscat_1.16.0 metapod_1.10.0
[191] dqrng_0.3.2 pbkrtest_0.5.2 timeDate_4022.108 lazyeval_0.2.2 shiny_1.8.0
[196] htmltools_0.5.7 sctransform_0.4.1 glue_1.6.2 factoextra_1.0.7 spam_2.10-0
[201] XVector_0.42.0 RCurl_1.98-1.13 rprojroot_2.0.4 scran_1.30.0 gridExtra_2.3
[206] EnvStats_2.8.1 boot_1.3-28.1 igraph_1.6.0 variancePartition_1.32.2 TMB_1.9.9
[211] R6_2.5.1 sva_3.50.0 tidyr_1.3.0 DESeq2_1.42.0 gplots_3.1.3
[216] fdrtool_1.2.17 RcppRoll_0.3.0 cluster_2.1.4 ipred_0.9-14 nloptr_2.0.3
[221] DelayedArray_0.28.0 tidyselect_1.2.0 vipor_0.4.5 htmlTable_2.4.2 ggforce_0.4.1
[226] car_3.1-2 AnnotationDbi_1.64.1 future_1.33.0 ModelMetrics_1.2.2.2 rsvd_1.0.5
[231] munsell_0.5.0 KernSmooth_2.23-20 data.table_1.14.10 htmlwidgets_1.6.4 ComplexHeatmap_2.16.0
[236] RColorBrewer_1.1-3 rlang_1.1.2 spatstat.sparse_3.0-3 spatstat.explore_3.2-5 lmerTest_3.1-3
[241] ggnewscale_0.4.9 fansi_1.0.6 hardhat_1.3.0 beeswarm_0.4.0 prodlim_2023.08.28 `

Interpreting raw and scaled ligand activity values

Hi!

Thanks for a great tool! Having all these functions to generate nice plots is very helpful!

I am not sure I understand ligand activity values correctly, so I would appreciate your feedback on it.

Am I correct that the raw values (orange on your plots) correspond to enrichment of targets of a particular ligand among all genes differentially expressed in a particular receiver cell type in both directions (up and down), and these values are mirrored for the two conditions I am contrasting?
Am I correct that these raw values (for all identified ligands in a particular receiver cell type) are then Z-scored and min-max scaled to produce pink values? If they are min-max scaled (becoming strictly non-negative), I don't understand why there are negative values in multinichenet_output$prioritization_tables$ligand_activities_target_de_tbl$activity_scaled.
I am trying to get intuition for the cases when after the scaling, the "direction" of the effect visually changes (in the attached example, pairs involving ADAM17 ligand for instance, or the last two pairs):

Are the up- and down-regulated values treated separately during normalization and scaling? I cannot understand how the ligand which had a very small, if any (because it looks white), enrichment of its targets among downregulated genes can have a high scaled activity in downregulated genes. Is it because other ligands had even smaller raw enrichment values among downregulated genes?

Thank you!
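
Some general intuition on the two kinds of scaling mentioned in this question, as a sketch on made-up numbers. This is not the package's code, and whether multinichenetr scales up- and downregulated activities separately is treated here as an assumption. The point is only that z-scores are centred on zero (so negative values are expected), that min-max scaling is bounded between 0 and 1, and that scaling within a group makes a value "high" relative to the other values in that group.

```r
# Made-up raw activities for four ligands, split by direction of regulation.
raw_up   <- c(L1 = 0.10, L2 = 0.06, L3 = 0.02, L4 = 0.01)
raw_down <- c(L1 = -0.03, L2 = -0.02, L3 = 0.005, L4 = -0.04)

zscore <- function(x) (x - mean(x)) / sd(x)
minmax <- function(x) (x - min(x)) / (max(x) - min(x))

round(zscore(raw_up), 2)     # centred on 0, so some values are negative
round(minmax(raw_up), 2)     # bounded between 0 and 1

# Scaling within each direction separately: L3 has a raw value near zero among
# the "down" activities, yet it is the largest value of that group.
round(zscore(raw_down), 2)
```

Under this assumption, a ligand with a near-zero raw enrichment can still receive a high scaled activity when the other ligands in the same group have even lower raw values.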

Unable to install multinichenetr

Hi,

I have been trying to install multinichenetr but it keeps giving me the same errors. This is the error message:

** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
*** arch - i386
*** arch - x64
Error: package or namespace load failed for 'multinichenetr':
.onLoad failed in loadNamespace() for 'Cairo', details:
call: library.dynam("Cairo", pkgname, libname)
error: DLL 'Cairo' not found: maybe not installed for this architecture?
Error: loading failed
Execution halted
ERROR: loading failed for 'x64'

removing 'C:/Users/sphiltje/Documents/R/win-library/4.0/multinichenetr'

I have already re-installed Cairo and I tried to use different versions of R. Nothing seems to work. My current R version is 4.0.4. I have also tried to install with the "dependencies=TRUE" option but this also didn't work. Is there anything that I can do to fix this issue?

Thanks!

Error in `dplyr::inner_join()`: Problem with `ligand` and `target`.

Hi,
When using the wrapper to run multinichenet on my data, I get the following error:

[1] "Calculate differential expression for all cell types"
[1] "DE analysis is done:"
[1] "included cell types are:"
 [1] "MAC"                     "SMACs"                   "VSM"                     "Tc"                      "Ductal.KCs"             
 [6] "Classical.MECs"          "Oral.Mucosal.KCs"        "ILCs"                    "Fibroblast.1"            "Fibroblast.2"           
[11] "Non.Classical.Monocytes" "Th"                      "MAIT"                    "CD8..IEL"                "IgA.Plasma.Cells"       
[16] "Plasmablasts"            "VECs"                   
[1] "Make diagnostic abundance plots + Calculate expression information"
[1] "Calculate NicheNet ligand activities and ligand-target links"
[1] "Combine all the information in prioritization tables"
Error in `dplyr::inner_join()`:
! Join columns in `x` must be present in the data.
✖ Problem with `ligand` and `target`.
Run `rlang::last_trace()` to see where the error occurred.
There were 35 warnings (use warnings() to see them)

Running rlang::last_trace() gave the following message:

<error/rlang_error>
Error in `dplyr::inner_join()`:
! Join columns in `x` must be present in the data.
✖ Problem with `ligand` and `target`.
---
Backtrace:
    ▆
 1. ├─multinichenetr::multi_nichenet_analysis(...)
 2. │ └─multinichenetr::multi_nichenet_analysis_combined(...)
 3. │   └─multinichenetr::lr_target_prior_cor_inference(...)
 4. │     └─... %>% dplyr::ungroup()
 5. ├─dplyr::ungroup(.)
 6. ├─dplyr::mutate(., id_target = paste(id, target, sep = "_"))
 7. ├─dplyr::inner_join(., ligand_target_df, by = c("ligand", "target"))
 8. └─dplyr:::inner_join.data.frame(., ligand_target_df, by = c("ligand", "target"))
Run rlang::last_trace(drop = FALSE) to see 4 hidden frames.

Any help you can provide regarding this error would be gratefully received!

Thanks,
Catherine

p.s. I installed multinichenet again a few days ago based on the recommendation from the error that I had been receiving then.

Bug `get_DE_info` when option `findMarkers = TRUE`

See issues #12 and #13 : bug in code of get_DE_info when option findMarkers = TRUE

Issue with abundance_expression_info, Error in `dplyr::pull()`: Caused by error: ! object 'ligand' not found

I ran get_abundance_expression_info() and encountered the error below. I'm not sure whether it is caused by the dplyr package itself or by something else.
Thank you in advance for your answer; I look forward to it.

abundance_expression_info = get_abundance_expression_info(sce = sce, sample_id = sample_id, group_id = group_id, celltype_id = celltype_id, min_cells = min_cells, senders_oi = senders_oi, receivers_oi = receivers_oi, lr_network = lr_network, batches = batches)
Error in dplyr::pull():
Caused by error:
! object 'ligand' not found
Run rlang::last_trace() to see where the error occurred.
There were 23 warnings (use warnings() to see them)

rlang::last_trace(drop = FALSE)
<error/rlang_error>
Error in dplyr::pull():
Caused by error:
! object 'ligand' not found

Backtrace:
▆

├─multinichenetr::get_abundance_expression_info(...)
│ ├─base::suppressMessages(...)
│ │ └─base::withCallingHandlers(...)
│ └─multinichenetr::process_info_to_ic(...)
│ └─lr_network %>% dplyr::pull(ligand) %>% unique()
├─base::unique(.)
├─dplyr::pull(., ligand)
├─dplyr:::pull.data.frame(., ligand)
│ └─tidyselect::vars_pull(names(.data), !!enquo(var))
│ ├─tidyselect:::with_chained_errors(...)
│ │ └─rlang::try_fetch(...)
│ │ ├─base::tryCatch(...)
│ │ │ └─base (local) tryCatchList(expr, classes, parentenv, handlers)
│ │ │ └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
│ │ │ └─base (local) doTryCatch(return(expr), name, parentenv, handler)
│ │ └─base::withCallingHandlers(...)
│ └─rlang::eval_tidy(expr, set_names(seq_along(vars), vars))
└─base::.handleSimpleError(...)
└─rlang (local) h(simpleError(msg, call))
└─handlers[[1L]](cnd)
└─rlang::abort(msg, call = call, parent = cnd)

How should I set the contrast table

Hi, thank you for your nice work on the multinichenetr project. When I started using it, I ran into a problem: I have a single-cell dataset consisting of 3 groups, A, B and C. For my specific purpose, I want to compare A vs C and B vs C, but no other comparisons. How should I set contrasts_oi and contrast_tbl in this situation?

Looking forward to your reply.
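
For reference, a minimal sketch of what the two objects could look like for this design. It follows the pattern used in other snippets in this thread (all contrasts in one string, each wrapped in single quotes, plus a tibble mapping each contrast to its group of interest). The group labels "A", "B" and "C" are taken from the question; that group C needs no row of its own is an assumption, not something confirmed by the maintainers.

```r
library(tibble)

# All contrasts go into ONE character string; each contrast is wrapped in single quotes.
contrasts_oi = c("'A-C','B-C'")

# One row per contrast; `group` names the condition of interest for that contrast.
# Assumption: C is only the reference here, so it gets no row of its own.
contrast_tbl = tibble(
  contrast = c("A-C", "B-C"),
  group = c("A", "B")
)

print(contrast_tbl)
```

The contrast names in contrast_tbl should match the ones inside contrasts_oi exactly, without the surrounding single quotes.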

How to use bulk RNASeq data?

Hi, thank you for the brilliant tool! I have 8 bulk RNASeq samples (4 control, 4 treated). Can I use multinichenet to analyze it?

Are logcounts required in sce object?

Hi,

Please can you confirm whether the count data needs to be log normalized or not in the sce object? In step-by-step mode, when I run get_DE_info on my data it gives an error about not finding logcounts in the assay slot.

However, when I run the wrapper function on that same data, the DE analysis part works (it later fails when reaching the "Combine all the information in prioritization tables" section; I created an issue on GitHub for this). I just wanted to check with you why the wrapper function does not fail if there are no logcounts in the assay slot. Are these somehow generated when using the wrapper function?

Thanks,
Catherine

senders and receivers selection for input multinichenet wrapper function

enable the possibility to specify senders and receivers in the wrapper functions

abundance_expression_info contains empty dfs

Hello,

I am running into an issue where I cannot run the prioritization_tables step from the MIS-C three-wise comparison vignette because some of the sender_receiver_info tables in the abundance_expression_info object are empty. This results in an error that says there are missing values. I am not sure how to fix this issue with the abundance_expression_info object and would really appreciate guidance.

Empty abundance expression info table example:

$sender_receiver_info$avg_df_group
# A tibble: 0 × 8
# Groups:   group, sender [0]
# ℹ 8 variables: group <chr>, sender <fct>, receiver <fct>, ligand <chr>,
#   receptor <chr>, avg_ligand_group <dbl>, avg_receptor_group <dbl>,
#   ligand_receptor_prod_group <dbl>

Error:

Error in `dplyr::mutate()`:
ℹ In argument: `scaled_LR_prod =
  nichenetr::scaling_zscore(ligand_receptor_prod)`.
Caused by error in `if (sd(x, na.rm = TRUE) > 0) ...`:
! missing value where TRUE/FALSE needed

How to manually adjust legend in make_sample_lr_prod_activity_plots()?

Hi,

Thanks again for the great package.
I have a quick question about how to manually adjust the range (or scale) of what I plot (e.g. scaled ligand activity).
I am producing multiple figures from the same analysis and am trying to give them all the same scale so that I only need one legend. Let me know if this does not make sense and I will try to explain further!
Also, would you mind letting me know how I can manually adjust other aspects of the figures, such as the colors/heat, the dot size (if that's possible), and the titles?

Thank you so much and hope you have a good rest of the week.

Best,
Vivian
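
A generic ggplot2 sketch of the idea, in case it helps. It assumes the plots in question are ordinary ggplot objects (or are assembled from them), which is not confirmed here; the data frame, the column names and the limits c(-3, 3) are made up for illustration. Fixing the `limits` of one shared scale and adding it to every figure gives all figures the same legend.

```r
library(ggplot2)

# Toy data standing in for scaled ligand activities (hypothetical values).
df = data.frame(
  ligand = c("L1", "L2", "L3"),
  group = c("A", "A", "B"),
  activity_scaled = c(-1.5, 0.2, 2.1)
)

# One shared fill scale: fixed limits make the legend identical across figures.
shared_fill = scale_fill_gradient2(
  low = "blue", mid = "white", high = "red",
  midpoint = 0, limits = c(-3, 3), name = "Scaled activity"
)

# Colors, title and text size are adjusted with the usual ggplot2 layers.
p = ggplot(df, aes(x = group, y = ligand, fill = activity_scaled)) +
  geom_tile() +
  shared_fill +
  ggtitle("Custom title") +
  theme(plot.title = element_text(size = 12))
```

Adding a new fill scale to a ggplot that already has one replaces the existing scale (ggplot2 prints a message when this happens).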

Error when running get_DE_info()

Hi,

Thank you so much for this amazing update.
I am trying to use it for a 3-way comparison and have been following the tutorial: https://github.com/saeyslab/multinichenetr/blob/main/vignettes/basic_analysis_steps_MISC.md

When performing the DE analysis for each cell type, however, I ran into this

Warning message:
“Unknown or uninitialised column: cluster_id.”
[1] "excluded cell types are:"
[1] "Granulosa" "Theca"
[1] "These celltypes are not considered in the analysis. After removing samples that contain less cells than the required minimal, some groups don't have 2 or more samples anymore. As a result the analysis cannot be run. To solve this: decrease the number of min_cells or change your group_id and pool all samples that belong to groups that are not of interest! "
[1] "None of the cell types passed the check. This might be due to 2 reasons. 1) no cell type has enough cells in >=2 samples per group. 2) problem in batch definition: not all levels of your batch are in each group - Also for groups not included in your contrasts!"
Error in dplyr::group_by():
! Must group by variables found in .data.
Column contrast is not found.
Column cluster_id is not found.
Traceback:

get_DE_info(sce = sce, sample_id = sample_id, group_id = group_id,
. celltype_id = celltype_id, covariates = covariates, contrasts_oi = contrasts_oi,
. min_cells = 10)
celltype_de$de_output_tidy %>% dplyr::inner_join(celltype_de$de_output_tidy %>%
. dplyr::group_by(contrast, cluster_id) %>% dplyr::count(),
. by = c("cluster_id", "contrast")) %>% dplyr::mutate(cluster_id = paste0(cluster_id,
. "\nnr of genes: ", n)) %>% dplyr::mutate(p-value <= 0.05 = p_val <=
. 0.05) %>% ggplot(aes(x = p_val, fill = p-value <= 0.05))
ggplot(., aes(x = p_val, fill = p-value <= 0.05))
dplyr::mutate(., p-value <= 0.05 = p_val <= 0.05)
dplyr::mutate(., cluster_id = paste0(cluster_id, "\nnr of genes: ",
. n))
dplyr::inner_join(., celltype_de$de_output_tidy %>% dplyr::group_by(contrast,
. cluster_id) %>% dplyr::count(), by = c("cluster_id", "contrast"))
inner_join.data.frame(., celltype_de$de_output_tidy %>% dplyr::group_by(contrast,
. cluster_id) %>% dplyr::count(), by = c("cluster_id", "contrast"))
auto_copy(x, y, copy = copy)
same_src(x, y)
same_src.data.frame(x, y)
is.data.frame(y)
celltype_de$de_output_tidy %>% dplyr::group_by(contrast, cluster_id) %>%
. dplyr::count()
dplyr::count(.)
dplyr::group_by(., contrast, cluster_id)
group_by.data.frame(., contrast, cluster_id)
group_by_prepare(.data, ..., .add = .add, error_call = current_env())
abort(bullets, call = error_call)
signal_abort(cnd, .file)

I am very confident that I have enough cells in every group and cell type. Any suggestion on what might solve the problem? Any help is highly appreciated! Thank you so much.

Regarding Ligand-receptor DB

Dear multinichenetr team,

Hello, thank you for developing a great algorithm.
I am using your algorithm to analyze cell-cell interactions in different cancer types.
In the meantime, I would like to ask you a question.

I am doing research on finding binding partner ligand molecules for human receptor genes.
The receptor genes I'm studying are largely unknown, although a few binding partners are known.

However, according to the ligand-receptor database provided by MultiNicheNet, several genes that have not been reported in the literature are curated as binding partners of the receptor gene I am targeting.

I have doubts about the reliability of this information because other similar databases, such as CellChat and CellPhoneDB, do not contain these curated pairs.

Is there any way to verify that the information about the curated gene pairs comes from a paper?
Or I would like to know on what evidence they curated that these genes interact with each other.

Thank you

Error in UseMethod("inner_join") when running make_lite_output

Hi,

when running my analysis using the step-wise approach, I encounter an error when running multinichenet_output = make_lite_output(multinichenet_output)

Error in UseMethod("inner_join") :
no applicable method for 'inner_join' applied to an object of class "NULL"

I had also had an error in the previous line of code where the function lr_target_prior_cor_inference is run. However, I was advised that this was due to having <5 samples in one of my groups (which is correct). Would this have anything to do with the above error? I therefore commented out the line where lr_target_prior_cor is added to multinichenet_output.

I know that the error does not impact the analysis but wanted to flag the issue.
Thanks,
Catherine
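
The error message itself can be reproduced outside the package: inner_join() has no method for NULL, so any pipeline that starts from a NULL object fails in this way. Below is a minimal sketch of an explicit guard, using toy tables. `safe_join` is a hypothetical helper written for this illustration; it is not part of multinichenetr and does not show how the package handles this case.

```r
library(dplyr)

# Stand-ins for the real objects (hypothetical toy data).
lr_target_prior_cor = NULL  # NULL, as reported above when a group has too few samples
LR_subset = tibble(ligand = "L1", receptor = "R1")

# Guard: return NULL early instead of calling inner_join() on NULL.
safe_join = function(x, y, by) {
  if (is.null(x)) {
    return(NULL)
  }
  inner_join(x, y, by = by)
}

# With a NULL input the guard returns NULL instead of raising an error.
result_null = safe_join(lr_target_prior_cor, LR_subset, by = c("ligand", "receptor"))

# With a real table the join runs as usual.
cor_tbl = tibble(ligand = "L1", receptor = "R1", pearson = 0.7)
result_tbl = safe_join(cor_tbl, LR_subset, by = c("ligand", "receptor"))
```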

Use on multiple samples but no different conditions

Hi NicheNet team!

I have more of a conceptual question and I was wondering if you could help me with that. I am working with a dataset that has 5 disease and 5 control samples. Because I have some cell types that are disease-specific, I can't use the comparison disease vs control in MultiNicheNet because I won't have the cells in control to compare to.
So, I was thinking about running the pipeline on the different groups separately and see what is different. Is it possible to use MultiNicheNet in just one group with multiple samples to correct for covariates and inter-sample variation? Should I just run the standard pipeline without defining the contrasts? Or should I use the original NicheNet implementation? What would you advise in this case?

Thank you very much!
Kind regards,
Nathan

circos plot

Hi,
I have a question about the circos plot. When I have several cell types with similar ligands and receptors (only the sender-receiver combination is unique), I seem unable to create the circos plot unless I prioritize the cell type that expresses them. Is there a way to make a circos plot with repeated ligands or receptors?
Thank you,
inbar

Error running multi_nichenet_analysis() in "Combine all the information in prioritization tables" step

Hi,

I got the following error when I ran multi_nichenet_analysis() on my dataset.

[1] "Calculate differential expression for all cell types"
[1] "DE analysis is done:"
[1] "included cell types are:"
[1] "NK" "Mono"
[1] "Make diagnostic abundance plots + Calculate expression information"
[1] "Calculate NicheNet ligand activities and ligand-target links"
[1] "Combine all the information in prioritization tables"
Error in metadata_combined[, c(sample_id, group_id)]:
! Can't subset columns that don't exist.
✖ Columns sample and condition don't exist.
Run rlang::last_trace() to see where the error occurred.

When I checked head(colData(sce)), I saw the columns 'sample' and 'condition'.

DataFrame with 6 rows and 31 columns
orig.ident sample condition nCount_RNA nFeature_RNA species gene_count tscp_count mread_count bc1_well bc2_well

03_01_66__s1 A3 X01.158H Non_infected_vaccina.. 2819 1645 hg38-viral 1645 2819 10871 A3 A1
03_05_02__s1 A3 X01.158H Non_infected_vaccina.. 5237 2421 hg38-viral 2421 5237 19782 A3 A5
03_07_50__s1 A3 X01.158H Non_infected_vaccina.. 5891 2805 hg38-viral 2805 5891 23960 A3 A7
03_07_94__s1 A3 X01.158H Non_infected_vaccina.. 5864 2795 hg38-viral 2795 5864 23805 A3 A7
03_08_34__s1 A3 X01.158H Non_infected_vaccina.. 3295 1875 hg38-viral 1875 3295 12988 A3 A8
03_11_31__s1 A3 X01.158H Non_infected_vaccina.. 6728 2873 hg38-viral 2873 6728 24538 A3 A11
bc3_well bc1_wind bc2_wind bc3_wind percent.mt percent.rps percent.rpl percent.rrna percent.hb nCount_SCT nFeature_SCT SCT_snn_res.1

03_01_66__s1 F6 3 1 66 0.851366 0.319262 0.212841 0 0.0000000 3277 1645 12
03_05_02__s1 A2 3 5 2 13.710139 0.286424 0.133664 0 0.0000000 4288 2420 2
03_07_50__s1 E2 3 7 50 2.359531 0.220676 0.101850 0 0.0000000 4375 2790 5
03_07_94__s1 H10 3 7 94 3.615280 0.306958 0.272851 0 0.0000000 4411 2775 5
03_08_34__s1 C10 3 8 34 4.400607 0.303490 0.364188 0 0.0606980 3420 1875 2
03_11_31__s1 C7 3 11 31 6.019620 0.549941 0.326992 0 0.0297265 4436 2709 2
seurat_clusters SCT_snn_res.0.8 cell.name predicted.celltype.l1.score predicted.celltype.l1 predicted.celltype.l2.score predicted.celltype.l2

03_01_66__s1 10 10 03_01_66__s1 0.963905 NK 0.963905 NK
03_05_02__s1 4 4 03_05_02__s1 0.653232 Mono 0.336875 CD16 Mono
03_07_50__s1 7 7 03_07_50__s1 0.580162 Mono 0.431182 CD16 Mono
03_07_94__s1 7 7 03_07_94__s1 0.778808 Mono 0.778808 CD16 Mono
03_08_34__s1 4 4 03_08_34__s1 0.704229 Mono 0.568204 CD14 Mono
03_11_31__s1 4 4 03_11_31__s1 0.840202 Mono 0.840202 CD14 Mono
ident

03_01_66__s1 10
03_05_02__s1 4
03_07_50__s1 7
03_07_94__s1 7
03_08_34__s1 4
03_11_31__s1 4

I wondered if this error could be related to not having enough statistical power, so I set 'p_val_adj = FALSE' and 'logFC_threshold = 0.20', but I still got this error.

How can I save the high resolution plots generated from multinichenetr with R code?

I'm running multinichenetr on a high-performance computing cluster. I would like to save the plots generated by the multinichenetr workflow with R code instead of through the RStudio interface. Is there a way to save the plots non-interactively, without displaying them? Thank you.
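
A generic sketch of non-interactive saving, assuming the plot in question is a ggplot object (an assumption; the toy plot below is built from mtcars only for illustration). ggsave() writes to a file without opening a window. For plots drawn with base graphics, opening a file device with pdf() and closing it with dev.off() works the same way.

```r
library(ggplot2)

# Toy plot standing in for a plot object returned by the workflow.
p = ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# Vector output (PDF) stays sharp at any zoom level.
out_file = file.path(tempdir(), "example_plot.pdf")
ggsave(filename = out_file, plot = p, width = 8, height = 6, units = "in")

# Alternative for any plot type: open a file device, draw, then close it.
out_file_device = file.path(tempdir(), "example_plot_device.pdf")
pdf(out_file_device, width = 8, height = 6)
print(p)
invisible(dev.off())
```

For raster formats such as PNG, the `dpi` argument of ggsave() (for example `dpi = 300`) controls the resolution.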

Condition-specific cell types - Can you run multinichenetr on datasets that have varying cell type composition?

Hi there,

Thank you for all the hard work on developing this tool!

I was wondering if/how multinichenetr can be run on a dataset that has some cell types that are condition specific. For example if I had a macrophage sub-population that was present in one condition but completely absent in the other I wouldn't be able to run the differential expression analysis. I know that CellChat has a functionality where you can harmonise the cell type annotations across conditions by setting the expression values to 0 where it is absent in a given condition, but then that matters less when the tool is only doing a comparative analysis rather than a differential expression for ligand receptor pairs between conditions.

I hope that all makes sense, I'd love to hear your thoughts or perhaps suggestions to workaround this. At the moment all I have is to try and harmonise the cell type labels by using broader annotations but then you lose the granularity of sub-clusters.

Thanks!

Olympia

All values in fraction_expressing_ligand_receptor column in prioritization_table_oi are 0

Hi,

This is related to #22

No results are returned when running get_top_n_lr_pairs. It seems that the reason for this is because in the table prioritization_tbl_oi all of the values in the fraction_expressing_ligand_receptor column are 0.

Similarly, in the same table, the columns fraction_receptor_group and fraction_ligand_group also contain only 0s. Do you know why this might be? I saw in expression_processing.R that there is a function called fix_frq_df, which fixes a muscat bug in the fraction calculation for cases where the expression fraction would be NA / NaN by changing NA to 0. Could this be related? If so, do you have any ideas for a workaround so that I can proceed with the analysis? We're really keen to apply mnn to our data!

Thanks for your help.

Catherine

Dissecting direct vs indirect effects of stimulation

I have a mixed population of in vitro cells, stimulated with an exogenous stimulant. I would like to dissect the direct effects of stimulation on the cells from the indirect effects (i.e. those due to intercellular signalling).

With the current target correlation, I'm concerned that the predicted downstream targets of ligands seem to overlap with the downstream targets of the exogenous stimulation.

I wondered if you have considered this problem, and if so do you have any suggestions for how to approach it?

I thought about perhaps manually adding the stimulant as an upregulated ligand in each cell type, and then manually annotating its downstream effects for NicheNet? Or removing the known receptors for the stimulant from the NicheNet data?

inner_join issue

Hello,

I'm using the latest version (1.0.3), but I'm getting an inner_join issue similar to a previously closed issue. The output below shows where it stops:

[1] "Calculate differential expression for all cell types"
[1] "DE analysis is done:"
[1] "included cell types are:"
 [1] "CD8.TEM"       "CD4.TEM"       "CD4.TCM"       "HSPC"          "NK"           
 [6] "CD4.CTL"       "Treg"          "CD8.TCM"       "NK_CD56bright" "CD4.Naive"    
[11] "CD8.Naive"     "ILC"           "CD14.Mono"     "cDC2"          "Platelet"     
[16] "MAIT"          "gdT"           "B.naive"      
[1] "Make diagnostic abundance plots + Calculate expression information"
[1] "Calculate NicheNet ligand activities and ligand-target links"
[1] "Combine all the information in prioritization tables"
Error in `dplyr::inner_join()`:
! Can't convert `out[merge]$target` <logical> to match type of `target` <character>.

Best,
Chang

How to visualize between sample differences in a group?

Hi, is there a way to visualize sample differences in a group using the dot plot or any other plot? I was looking for a plot similar to the one we make for each group_oi using make_sample_lr_prod_activity_plots but highlighting differences between samples?

Warning when running lr_target_prior_cor_inference()

Hi,

Thank you for all the help you have provided.

I have encountered another error when using this package when running lr_target_prior_cor_inference():

Warning message in FUN(X[[i]], ...):
“not enough samples for a correlation analysis for the celltype Cumulus”
[1] "For no celltypes, sufficient samples (>= 5) were available for a correlation analysis. lr_target_prior_cor, the output of this function, will be NULL. As a result, not all types of downstream visualizations can be created."

This is only a warning and does not throw an error. However, the following code does throw one when I try to build and save the output:
multinichenet_output = list(
celltype_info = abundance_expression_info$celltype_info,
celltype_de = celltype_de,
sender_receiver_info = abundance_expression_info$sender_receiver_info,
sender_receiver_de = sender_receiver_de,
ligand_activities_targets_DEgenes = ligand_activities_targets_DEgenes,
prioritization_tables = prioritization_tables,
grouping_tbl = grouping_tbl,
lr_target_prior_cor = lr_target_prior_cor
)
multinichenet_output = make_lite_output(multinichenet_output)
save = FALSE
if(save == TRUE){
saveRDS(multinichenet_output, paste0(path, "multinichenet_Theca+GratoCC.rds"))
}

This is the error:
Error in UseMethod("inner_join"): no applicable method for 'inner_join' applied to an object of class "NULL"
Traceback:

make_lite_output(multinichenet_output)
multinichenet_output$lr_target_prior_cor %>% dplyr::inner_join(LR_subset_cor,
. by = c("sender", "receiver", "ligand", "receptor")) %>% dplyr::filter(target %in%
. gene_subset)
dplyr::filter(., target %in% gene_subset)
dplyr::inner_join(., LR_subset_cor, by = c("sender", "receiver",
. "ligand", "receptor"))

What is also strange is that none of my cell types has more than 5 samples (I am running multiple multinichenet runs for different CCI between cell types)
For this one above I have:

But some of the runs will throw the error message (“not enough samples for a correlation analysis for the celltype”), while some other runs will not. Why?

Any help would be highly appreciated. Thank you.

Best,
Vivian

Issue with parallelization of `get_ligand_activities_targets_DEgenes` on some systems

Parallelization seems to stall @maartenciers

In case you encounter this issue: run the function serially, with one core only.

No results returned when running get_top_n_lr_pairs

Hi,

Sorry, me again. I am not sure whether this is a bug or just that no results were found with my dataset. When I run the following:
prioritized_tbl_oi_all = get_top_n_lr_pairs(multinichenet_output$prioritization_tables, 50, rank_per_group = FALSE)

I get an empty table:

However, there are plenty of data in multinichenet_output$prioritization_tables:

> multinichenet_output$prioritization_tables
$group_prioritization_tbl
# A tibble: 295,482 × 60
   contrast group sender receiver ligand receptor lfc_ligand lfc_receptor ligand_receptor_lfc_…¹ p_val_ligand p_adj_ligand p_val_receptor
   <chr>    <chr> <chr>  <chr>    <chr>  <chr>         <dbl>        <dbl>                  <dbl>        <dbl>        <dbl>          <dbl>
 1 Minor.S… Mino… Acina… T.Cells  PF4    CXCR3          6.21         1.31                   3.76   0.00000971    0.000257        0.00738 
 2 Minor.S… Mino… Acina… T.Cells  PF4    CXCR3          6.21         1.31                   3.76   0.00000971    0.000257        0.00738 
 3 Submand… Subm… Oral.… Fibrobl… INHBB  ACVR1B         3.74         1.4                    2.57   0.0227        0.153           0.0037  
 4 Submand… Subm… Oral.… Fibrobl… INHBB  ACVR1B         3.74         1.4                    2.57   0.0227        0.153           0.0037  
 5 Parotid… Paro… VECs   Fibrobl… POSTN  ITGB3          2.66         2.37                   2.52   0.000684      0.0288          0.00101 
 6 Parotid… Paro… VECs   Fibrobl… POSTN  ITGB3          2.66         2.37                   2.52   0.000684      0.0288          0.00101 
 7 Parotid… Paro… VECs   Fibrobl… TGM2   ITGB3          3.6          2.37                   2.99   0.00000006    0.0000768       0.00101 
 8 Parotid… Paro… VECs   Fibrobl… TGM2   ITGB3          3.6          2.37                   2.99   0.00000006    0.0000768       0.00101 
 9 Submand… Subm… Acina… Mac.Mono FURIN  INSR           1.91         2.12                   2.02   0.000262      0.00884         0.000977
10 Submand… Subm… Acina… Mac.Mono FURIN  INSR           1.91         2.12                   2.02   0.000262      0.00884         0.000977
# ℹ 295,472 more rows
# ℹ abbreviated name: ¹ligand_receptor_lfc_avg
# ℹ 48 more variables: p_adj_receptor <dbl>, activity <dbl>, direction_regulation <fct>, activity_scaled <dbl>, lr_interaction <chr>,
#   id <chr>, avg_ligand_group <dbl>, avg_receptor_group <dbl>, ligand_receptor_prod_group <dbl>, fraction_ligand_group <dbl>,
#   fraction_receptor_group <dbl>, ligand_receptor_fraction_prod_group <dbl>, rel_abundance_scaled_sender <dbl>,
#   rel_abundance_scaled_receiver <dbl>, sender_receiver_rel_abundance_avg <dbl>, lfc_pval_ligand <dbl>, p_val_ligand_adapted <dbl>,
#   scaled_lfc_ligand <dbl>, scaled_p_val_ligand <dbl>, scaled_lfc_pval_ligand <dbl>, scaled_p_val_ligand_adapted <dbl>, …
# ℹ Use `print(n = ...)` to see more rows

$sample_prioritization_tbl
# A tibble: 16,204,903 × 26
   sample                  sender receiver ligand receptor avg_ligand avg_receptor ligand_receptor_prod fraction_ligand fraction_receptor
   <chr>                   <chr>  <chr>    <chr>  <chr>         <dbl>        <dbl>                <dbl>           <dbl>             <dbl>
 1 X3545cb11.ec09.4475.80… Acina… T.Cells  ZG16B  CXCR4         10.7          4.75                 50.7               0                 0
 2 X3545cb11.ec09.4475.80… Ducta… T.Cells  ZG16B  CXCR4          9.97         4.75                 47.4               0                 0
 3 d2110830.d924.4d18.a19… Acina… ILCs     ZG16B  CXCR4         11.3          4.04                 45.5               0                 0
 4 d2110830.d924.4d18.a19… Ducta… ILCs     ZG16B  CXCR4         10.5          4.04                 42.4               0                 0
 5 fe0ae9cb.e1f3.4dfb.944… Dendr… Pericyt… S100A9 CD36           7.95         5.21                 41.4               0                 0
 6 X71f09799.9a58.4b1a.89… Acina… T.Cells  ZG16B  CXCR4         11.2          3.53                 39.7               0                 0
 7 fe0ae9cb.e1f3.4dfb.944… Dendr… Pericyt… S100A8 CD36           7.52         5.21                 39.2               0                 0
 8 ead6ff92.9050.4aec.8f8… Mac.M… Ionocyt… HLA.D… CD9            7.33         5.33                 39.1               0                 0
 9 d2110830.d924.4d18.a19… Acina… T.Cells  ZG16B  CXCR4         11.3          3.35                 37.6               0                 0
10 c6860865.01a6.4d26.b0f… Dendr… LECs     HLA.D… CD9            7.48         5.01                 37.5               0                 0
# ℹ 16,204,893 more rows
# ℹ 16 more variables: ligand_receptor_fraction_prod <dbl>, pb_ligand <dbl>, pb_receptor <dbl>, ligand_receptor_pb_prod <dbl>,
#   group <chr>, prioritization_score <dbl>, lr_interaction <chr>, id <chr>, scaled_LR_prod <dbl>, scaled_LR_frac <dbl>,
#   scaled_LR_pb_prod <dbl>, n_cells_receiver <dbl>, keep_receiver <dbl>, n_cells_sender <dbl>, keep_sender <dbl>,
#   keep_sender_receiver <fct>
# ℹ Use `print(n = ...)` to see more rows

$ligand_activities_target_de_tbl
# A tibble: 2,659,224 × 11
# Groups:   receiver, contrast [33]
   contrast             receiver ligand activity activity_scaled target ligand_target_weight logFC   p_val p_val_adj direction_regulation
   <chr>                <chr>    <chr>     <dbl>           <dbl> <chr>                 <dbl> <dbl>   <dbl>     <dbl> <fct>               
 1 Parotid.Gland-(Mino… Acinar.… A2M      0.0505           0.800 ABCA1               0.00814 2.42  1.48e-3   0.01    up                  
 2 Parotid.Gland-(Mino… Acinar.… A2M      0.0505           0.800 AHNAK               0.00726 1.72  3.12e-3   0.0164  up                  
 3 Parotid.Gland-(Mino… Acinar.… A2M      0.0505           0.800 AHR                 0.00761 3.07  1.11e-4   0.00209 up                  
 4 Parotid.Gland-(Mino… Acinar.… A2M      0.0505           0.800 ALOX5…              0.00727 2.32  9.09e-3   0.0339  up                  
 5 Parotid.Gland-(Mino… Acinar.… A2M      0.0505           0.800 ANGPT…              0.00747 2.9   3   e-4   0.00374 up                  
 6 Parotid.Gland-(Mino… Acinar.… A2M      0.0505           0.800 BCL2                0.0149  4.58  6.66e-4   0.00608 up                  
 7 Parotid.Gland-(Mino… Acinar.… A2M      0.0505           0.800 BCL2L…              0.00797 1.97  3.13e-3   0.0165  up                  
 8 Parotid.Gland-(Mino… Acinar.… A2M      0.0505           0.800 BCL3                0.00825 0.701 4.55e-2   0.107   up                  
 9 Parotid.Gland-(Mino… Acinar.… A2M      0.0505           0.800 BHLHE…              0.00945 2.13  5.05e-4   0.00518 up                  
10 Parotid.Gland-(Mino… Acinar.… A2M      0.0505           0.800 BIRC3               0.00988 2.05  1.69e-3   0.011   up                  
# ℹ 2,659,214 more rows
# ℹ Use `print(n = ...)` to see more rows

Even if I only ask for the top 1 result to be returned, the resulting table is empty. Can you please explain what the definition of top X results are? i.e. what is used to define a result making the top n list? Can you think why I may not have any results? I also tried changing rank_per_group = TRUE but that did not help. Any insight you could give would be appreciated. In case it is relevant, one of my groups has only 2 samples.

Thanks,
Catherine

Error !is.null(pbs[["group_id"]]) is not TRUE

Hello,

I am getting the following error on every cell type when calling get_DE_info:

perform_muscat_de_analysis errored for celltype: CD8+ T cells
Here's the original error message:
!is.null(pbs[["group_id"]]) is not TRUE
<simpleError in .check_pbs(pb, check_by = TRUE): !is.null(pbs[["group_id"]]) is not TRUE>
perform_muscat_de_analysis errored for celltype: CD8+ T cells

This is my code

#make sure that the gene symbols used in the expression data are also updated and made syntactically valid.
sce = alias_to_symbol_SCE(sce, "human") %>% makenames_SCE()

#Define in which metadata columns we can find the group, sample and cell type IDs
sample_id = "patient"
group_id = "disease"
celltype_id = "level_2"
covariates = NA
batches = NA 

senders_oi = SummarizedExperiment::colData(sce)[,celltype_id] %>% unique()
receivers_oi = SummarizedExperiment::colData(sce)[,celltype_id] %>% unique()

min_cells=10

abundance_expression_info$abund_plot_sample

contrasts_oi = c("'H-P','P-N','N-P'")

contrast_tbl = tibble(contrast = 
                        c('H-P','P-N','N-P'), 
                      group = c("H","P","N"))

DE_info = get_DE_info(sce = sce, sample_id = sample_id, group_id = group_id, celltype_id = celltype_id, batches = batches, covariates = covariates, contrasts_oi = contrasts_oi, min_cells = min_cells)

I have tried running it via the wrapper, as well as altering the group_ID variable and the contrasts_oi, with the same problem. I have checked my contrasts_oi are set up properly. I have 300000 cells, so plenty in each group, and have also had the same error when I have downsampled to test on a smaller object of 5000 cells.

Do you have any ideas how I can overcome this?

Thanks,

Caroline

R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /gpfs3/apps/eb/2020b/skylake/software/OpenBLAS/0.3.9-GCC-9.3.0/lib/libopenblas_skylakexp-r0.3.9.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8     LC_MONETARY=en_GB.UTF-8   
 [6] LC_MESSAGES=en_GB.UTF-8    LC_PAPER=en_GB.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] patchwork_1.1.2             future_1.30.0               clustree_0.5.0              ggraph_2.1.0                viridis_0.6.2              
 [6] viridisLite_0.4.1           forcats_0.5.2               stringr_1.5.0               purrr_1.0.1                 readr_2.1.3                
[11] tidyr_1.3.0                 tibble_3.2.1                tidyverse_1.3.2             SeuratObject_4.1.3          Seurat_4.3.0               
[16] multinichenetr_1.0.0        ggplot2_3.4.0               dplyr_1.1.2                 SingleCellExperiment_1.16.0 SummarizedExperiment_1.24.0
[21] Biobase_2.54.0              GenomicRanges_1.46.1        GenomeInfoDb_1.30.1         IRanges_2.28.0              S4Vectors_0.32.4           
[26] BiocGenerics_0.40.0         MatrixGenerics_1.6.0        matrixStats_0.63.0
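
The post does not show what causes this error. Two things that may be worth ruling out are sketched below on toy metadata. Both are assumptions rather than a confirmed diagnosis: (1) every sample should belong to exactly one group, and (2) the group labels should be syntactically valid R names. The column names `patient` and `disease` are taken from the code above; the values are made up.

```r
# Toy metadata standing in for colData(sce) (hypothetical values).
meta = data.frame(
  patient = c("p1", "p1", "p2", "p2", "p3"),
  disease = c("H", "H", "P", "N", "N")
)

# Check 1: count the distinct groups per sample; anything above 1 is ambiguous.
groups_per_sample = tapply(meta$disease, meta$patient, function(x) length(unique(x)))
ambiguous_samples = names(groups_per_sample)[groups_per_sample > 1]

# Check 2: group labels that make.names() would change are not syntactically valid.
group_labels = unique(meta$disease)
invalid_labels = group_labels[make.names(group_labels) != group_labels]

print(ambiguous_samples)
print(invalid_labels)
```

In this toy example patient "p2" appears under two disease labels, so it is flagged by the first check; all three labels pass the second.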

no sample_id

Hi, can I set sample_id = NA (or something similar) so that this parameter is ignored?

Nichenet v2 networks and matrices compatibility with nichenetr pipeline

Hi,
first of all thank you for your work,
I was wondering whether, when following your pipeline "step-by-step starting from a seurat object" (https://github.com/saeyslab/nichenetr/blob/master/vignettes/seurat_steps.md),
I could use the new "Nichenet v2 networks and matrices" that you included in the new multinichenetr (https://github.com/saeyslab/multinichenetr/blob/main/vignettes/detailed_analysis_steps_MISC.md).
So instead of downloading ligand_target_matrix, lr_network and weighted_networks from https://zenodo.org/record/3260758#.YlgI3S0QNQI, could I use https://zenodo.org/record/5884439#.YlgI_i0QNQI?

I also noticed that all the new files are bigger than the old ones, with the exception of lr_network: lr_network.rds (old) is bigger than lr_network_human_21122021.rds (new).

Interpretation of scaled ligand activity values

Hi - thanks again for this great tool.

A quick question about the scaled ligand activity value plots.

I have one specific interaction (shown below - wnt5a -fzd6) which looks to be upregulated in my negative group on the absolute values, but then in the scaled value looks to have exactly the same values for up and down in both groups. I am not sure how to interpret this - are you able to shed any light?

Thanks

Caroline

inner_join Issue in "Intercellular regulatory network systems view"

Hi multinichenetr team!

I have followed the following vignette and really like the package:
https://github.com/saeyslab/multinichenetr/blob/main/vignettes/paired_analysis_SCC.md

Everything has worked perfectly until the last part with "Intercellular regulatory network systems view"
This is the code and error.

group_oi = "ASpecificCondition"
receiver_oi = "ASpecificCelltype"

prioritized_tbl_oi = get_top_n_lr_pairs(multinichenet_output$prioritization_tables, 150, rank_per_group = FALSE)

lr_target_prior_cor_filtered = multinichenet_output$prioritization_tables$group_prioritization_tbl$group %>% unique() %>% lapply(function(group_oi){
  lr_target_prior_cor_filtered = multinichenet_output$lr_target_prior_cor %>% inner_join(multinichenet_output$ligand_activities_targets_DEgenes$ligand_activities %>% distinct(ligand, target, direction_regulation, contrast)) %>% inner_join(contrast_tbl) %>% filter(group == group_oi)
  lr_target_prior_cor_filtered_up = lr_target_prior_cor_filtered %>% filter(direction_regulation == "up") %>% filter( (rank_of_target < top_n_target) & (pearson > 0.50 | spearman > 0.50))
  lr_target_prior_cor_filtered_down = lr_target_prior_cor_filtered %>% filter(direction_regulation == "down") %>% filter( (rank_of_target < top_n_target) & (pearson < -0.50 | spearman < -0.50))
  lr_target_prior_cor_filtered = bind_rows(lr_target_prior_cor_filtered_up, lr_target_prior_cor_filtered_down)
}) %>% bind_rows()

This is the output I get:

Joining with by = join_by(receiver, ligand, target)
Warning in inner_join(., multinichenet_output$ligand_activities_targets_DEgenes$ligand_activities %>% :
Detected an unexpected many-to-many relationship between x and y.
ℹ Row 16 of x matches multiple rows in y.
ℹ Row 107338 of y matches multiple rows in x.
ℹ If a many-to-many relationship is expected, set relationship = "many-to-many" to silence this warning.
Joining with by = join_by(contrast)
Joining with by = join_by(receiver, ligand, target)
Warning in inner_join(., multinichenet_output$ligand_activities_targets_DEgenes$ligand_activities %>% :
Detected an unexpected many-to-many relationship between x and y.
ℹ Row 16 of x matches multiple rows in y.
ℹ Row 107338 of y matches multiple rows in x.
ℹ If a many-to-many relationship is expected, set relationship = "many-to-many" to silence this warning.
Joining with by = join_by(contrast)
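
The warning above is dplyr's generic many-to-many check, which fires whenever join keys are duplicated on both sides. The following is a minimal sketch with two hypothetical toy tables (not the real multinichenetr output) showing how to find the duplicated keys and how the `relationship` argument of `inner_join()` (dplyr 1.1.0 or later) silences the warning once the duplication is known to be intended. Whether it is intended in the actual `lr_target_prior_cor` join is not stated in this thread.

```r
library(dplyr)

# Hypothetical toy tables: the same key (receiver, ligand, target) occurs twice on each side
x <- tibble(receiver = "A", ligand = "L1", target = c("T1", "T1"), pearson = c(0.6, 0.7))
y <- tibble(receiver = "A", ligand = "L1", target = c("T1", "T1"), contrast = c("c1", "c2"))

# Inspect which keys are duplicated before deciding how to join
dup_keys_x <- x %>% count(receiver, ligand, target) %>% filter(n > 1)
dup_keys_y <- y %>% count(receiver, ligand, target) %>% filter(n > 1)

# Declare the many-to-many relationship explicitly; every x row pairs with every matching y row
joined <- inner_join(x, y,
                     by = c("receiver", "ligand", "target"),
                     relationship = "many-to-many")
nrow(joined)
```

If the duplicates are not expected, removing them with `distinct()` on the key columns before joining is the usual alternative.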

I also get an error when plotting in the next step:
Warning in max(f) : no non-missing arguments to max; returning -Inf
Error in ggraph::geom_edge_loop():
! Problem while computing stat.
ℹ Error occurred in the 2nd layer.
Caused by error in seq_len():
! argument must be coercible to non-negative integer
Backtrace:

base (local) <fn>(x)
ggplot2:::print.ggplot(x)
ggplot2:::ggplot_build.ggplot(x)
ggplot2:::by_layer(...)
ggplot2 (local) f(l = layers[[i]], d = data[[i]])
...
ggplot2 (local) compute_statistic(..., self = self)
self$stat$compute_layer(data, self$computed_stat_params, layout)
ggplot2 (local) compute_layer(..., self = self)
ggplot2:::dapply(...)
ggplot2:::split_with_index(seq_len(nrow(df)), ids)

Error in ggraph::geom_edge_loop(aes(color = direction_regulation), edge_width = 1, :
ℹ Error occurred in the 2nd layer.
Caused by error in seq_len():
! argument must be coercible to non-negative integer
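
The pair of messages in this report (`max(f)` returning `-Inf`, then `seq_len()` failing) can be reproduced in base R when a vector is empty. The sketch below only illustrates that mechanism. It is an assumption, not something confirmed in this thread, that an empty edge table passed to the plot is the cause here. The variable names are hypothetical.

```r
# An empty grouping vector, standing in for a layer that received zero rows (assumption)
f <- integer(0)

# max() of an empty vector warns and returns -Inf
n_groups <- suppressWarnings(max(f))

# seq_len() cannot turn -Inf into a non-negative integer, so it raises an error
msg <- tryCatch(seq_len(n_groups), error = function(e) conditionMessage(e))
msg

# A simple guard before plotting: check that there is something to draw
if (length(f) == 0) {
  message("No rows to plot; check the filtering thresholds used upstream.")
}
```

In practice this suggests checking `nrow()` of the filtered table before building the graph.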

Error calling graph_plot$plot

Hi @browaeysrobin,

I am trying to generate the Intercellular regulatory network systems view you include at the end of the basic analysis vignette. When I call graph_plot$plot I get the following error:

Error in ggraph::geom_edge_loop(aes(color = direction_regulation), edge_width = 1, :

ℹ Error occurred in the 2nd layer.
Caused by error in `seq_len()`:
! argument must be coercible to non-negative integer
27. stop(fallback)
26. signal_abort(cnd, .file)
25. rlang::abort(message, ..., call = call, use_cli_format = TRUE,
.frame = .frame)
24. cli::cli_abort(c("Problem while {step}.", i = "Error occurred in the {ordinal(i)} layer."),
call = layers[[i]]$constructor, parent = cnd)
23. handlers[[1L]](cnd)
22. h(simpleError(msg, call))
21. .handleSimpleError(function (cnd)
{
{
.__handler_frame__. <- TRUE ...
20. split_with_index(seq_len(nrow(df)), ids)
19. dapply(data, "PANEL", function(data) {
scales <- layout$get_scales(data$PANEL[1])
try_fetch(inject(self$compute_panel(data = data, scales = scales,
!!!params)), error = function(cnd) { ...
18. compute_layer(..., self = self)
17. self$stat$compute_layer(data, self$computed_stat_params, layout)
16. compute_statistic(..., self = self)
15. l$compute_statistic(d, layout)
14. f(l = layers[[i]], d = data[[i]])
13. withCallingHandlers(expr, condition = function(cnd) {
{
.__handler_frame__. <- TRUE
.__setup_frame__. <- frame ...
12. doTryCatch(return(expr), name, parentenv, handler)
11. tryCatchOne(expr, names, parentenv, handlers[[1L]])
10. tryCatchList(expr, classes, parentenv, handlers)
9. tryCatch(withCallingHandlers(expr, condition = function(cnd) {
{
.__handler_frame__. <- TRUE
.__setup_frame__. <- frame ...
8. try_fetch(for (i in seq_along(data)) {
out[[i]] <- f(l = layers[[i]], d = data[[i]])
}, error = function(cnd) {
cli::cli_abort(c("Problem while {step}.", i = "Error occurred in the {ordinal(i)} layer."), ...
7. by_layer(function(l, d) l$compute_statistic(d, layout), layers,
data, "computing stat")
6. ggplot_build.ggplot(x)
5. NextMethod()
4. ggplot_build.ggraph(x)
3. ggplot_build(x)
2. print.ggplot(x)
1. (function (x, ...)
UseMethod("print"))(x)

Thank you again for the help!

Sergio

Vignettes not up-to-date

The vignettes still need to be cleaned up and updated to use the new example dataset.

How to proceed when a cell type does not have enough number of cells in a group

Hello!

I'm running MultiNicheNet on a single-cell object in which I have three groups across different cell types. However, when I run

DE_info <- get_DE_info(sce = sce, sample_id = sample_id, group_id = group_id, celltype_id = celltype_id, batches = batches, covariates = covariates, contrasts_oi = contrasts_oi, min_cells = min_cells, findMarkers = TRUE)

I get the following message: "excluded cell types are:"

How should I proceed? Should these cell types be removed? What do the community and the package developers usually do in this case?
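
To see which cell types are likely to be excluded, it can help to tabulate the number of cells per sample and cell type against `min_cells`. This is a minimal base R sketch on hypothetical per-cell metadata. The column names and the threshold value are illustrative and are not taken from the package.

```r
# Hypothetical per-cell metadata (one row per cell); names are illustrative only
meta <- data.frame(
  sample   = c(rep("S1", 12), rep("S2", 3), rep("S3", 15)),
  group    = c(rep("A", 12), rep("A", 3), rep("B", 15)),
  celltype = c(rep("T", 10), rep("B", 2), rep("T", 3), rep("T", 9), rep("B", 6))
)
min_cells <- 10  # assumed threshold for this illustration

# Count cells for every sample x cell type combination
counts <- as.data.frame(table(sample = meta$sample, celltype = meta$celltype),
                        responseName = "n_cells")

# Flag combinations that reach the threshold
counts$keep <- counts$n_cells >= min_cells
counts
```

Cell types with too few passing samples in a group are the ones to expect in the list of excluded cell types.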

issue with filtering of genes before DE analysis

muscat's default edgeR::filterByExpr is too stringent for pseudobulk data from single-cell cohort studies.

The package's own filtering solution currently fails for complex designs.

Issue with dplyr::standardise_join_by() with get_abundance_expression_info

Hi all,
I am working through the basic analysis steps tutorial with my dataset. However, I get an error when I try to run get_abundance_expression_info(). Traceback below:

20: stop(fallback)
19: signal_abort(cnd)
18: abort(c("`by` must be supplied when `x` and `y` have no common variables.", 
        i = "use by = character()` to perform a cross-join."))
17: standardise_join_by(by, x_names = x_names, y_names = y_names)
16: join_cols(tbl_vars(x), tbl_vars(y), by = by, suffix = suffix, 
        keep = keep)
15: join_mutate(x, y, by = by, type = "inner", suffix = suffix, na_matches = na_matches, 
        keep = keep)
14: inner_join.data.frame(pseudobulk_counts_celltype$sample %>% data.frame() %>% 
        tibble::rownames_to_column("sample") %>% dplyr::mutate(effective_library_size = lib.size * 
        norm.factors), pseudobulk_counts_celltype$counts %>% data.frame() %>% 
        tibble::rownames_to_column("gene") %>% tidyr::gather(sample, 
        pb_raw, -gene))
13: dplyr::inner_join(pseudobulk_counts_celltype$sample %>% data.frame() %>% 
        tibble::rownames_to_column("sample") %>% dplyr::mutate(effective_library_size = lib.size * 
        norm.factors), pseudobulk_counts_celltype$counts %>% data.frame() %>% 
        tibble::rownames_to_column("gene") %>% tidyr::gather(sample, 
        pb_raw, -gene))
12: FUN(X[[i]], ...)
11: lapply(., function(celltype_oi, pb) {
        pseudobulk_counts_celltype = edgeR::DGEList(pb@assays@data[[celltype_oi]])
        non_zero_samples = pseudobulk_counts_celltype %>% apply(2, 
            sum) %>% .[. > 0] %>% names()
        pseudobulk_counts_celltype = pseudobulk_counts_celltype[, 
            non_zero_samples]
        pseudobulk_counts_celltype = edgeR::calcNormFactors(pseudobulk_counts_celltype)
        pseudobulk_counts_celltype_df = dplyr::inner_join(pseudobulk_counts_celltype$sample %>% 
            data.frame() %>% tibble::rownames_to_column("sample") %>% 
            dplyr::mutate(effective_library_size = lib.size * norm.factors), 
            pseudobulk_counts_celltype$counts %>% data.frame() %>% 
                tibble::rownames_to_column("gene") %>% tidyr::gather(sample, 
                pb_raw, -gene))
        pseudobulk_counts_celltype_df = pseudobulk_counts_celltype_df %>% 
            dplyr::mutate(pb_norm = pb_raw/effective_library_size) %>% 
            dplyr::mutate(pb_sample = log2((pb_norm * 1e+06) + 1)) %>% 
            tibble::as_tibble() %>% dplyr::mutate(celltype = celltype_oi)
    }, pb)
10: list2(...)
9: dplyr::bind_rows(.)
8: dplyr::select(., gene, sample, pb_sample, celltype)
7: dplyr::distinct(.)
6: sce$cluster_id %>% unique() %>% lapply(function(celltype_oi, 
       pb) {
       pseudobulk_counts_celltype = edgeR::DGEList(pb@assays@data[[celltype_oi]])
       non_zero_samples = pseudobulk_counts_celltype %>% apply(2, 
           sum) %>% .[. > 0] %>% names()
       pseudobulk_counts_celltype = pseudobulk_counts_celltype[, 
           non_zero_samples]
       pseudobulk_counts_celltype = edgeR::calcNormFactors(pseudobulk_counts_celltype)
       pseudobulk_counts_celltype_df = dplyr::inner_join(pseudobulk_counts_celltype$sample %>% 
           data.frame() %>% tibble::rownames_to_column("sample") %>% 
           dplyr::mutate(effective_library_size = lib.size * norm.factors), 
           pseudobulk_counts_celltype$counts %>% data.frame() %>% 
               tibble::rownames_to_column("gene") %>% tidyr::gather(sample, 
               pb_raw, -gene))
       pseudobulk_counts_celltype_df = pseudobulk_counts_celltype_df %>% 
           dplyr::mutate(pb_norm = pb_raw/effective_library_size) %>% 
           dplyr::mutate(pb_sample = log2((pb_norm * 1e+06) + 1)) %>% 
           tibble::as_tibble() %>% dplyr::mutate(celltype = celltype_oi)
   }, pb) %>% dplyr::bind_rows() %>% dplyr::select(gene, sample, 
       pb_sample, celltype) %>% dplyr::distinct()
5: get_pseudobulk_logCPM_exprs(sce, sample_id = sample_id, celltype_id = celltype_id, 
       group_id = group_id, batches = batches, assay_oi_pb = "counts", 
       fun_oi_pb = "sum")
4: get_avg_frac_exprs_abund(sce = sce, sample_id = sample_id, celltype_id = celltype_id, 
       group_id = group_id, batches = batches)
3: withCallingHandlers(expr, message = function(c) if (inherits(c, 
       classes)) tryInvokeRestart("muffleMessage"))
2: suppressMessages(get_avg_frac_exprs_abund(sce = sce, sample_id = sample_id, 
       celltype_id = celltype_id, group_id = group_id, batches = batches))
1: get_abundance_expression_info(sce = sce, sample_id = "orig.ident", 
       group_id = "group", celltype_id = "cluster", min_cells = min_cells, 
       senders_oi = senders_oi, receivers_oi = receivers_oi, lr_network = lr_network)
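
For readers following the traceback, the normalisation inside `get_pseudobulk_logCPM_exprs` comes down to a few arithmetic steps. Here they are worked through in base R on made-up numbers. The counts, library size and normalisation factor are hypothetical. The formulas are copied from the code shown in the traceback.

```r
# Hypothetical pseudobulk counts for one sample
pb_raw <- c(geneA = 50, geneB = 150)

# Hypothetical edgeR-style sample annotations
lib.size <- 200
norm.factors <- 1

# effective_library_size = lib.size * norm.factors
effective_library_size <- lib.size * norm.factors

# pb_norm = pb_raw / effective_library_size
pb_norm <- pb_raw / effective_library_size

# pb_sample = log2((pb_norm * 1e6) + 1), i.e. log2 counts-per-million with a pseudocount of 1
pb_sample <- log2((pb_norm * 1e6) + 1)
pb_sample
```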

Error: not enough samples when there are

Hello,
Thank you for a great tool and detailed vignettes. I have a question related to the error: "not enough samples per group with sufficient cells of this cell type". I ran into this error with a cell type (CAFs) when I know for sure that there are enough cells, see below:

Do you know why this error showed up?

Thank you!

Setting multiple covariates and batches in MultiNicheNet

As per the title: I was wondering how to set multiple covariates and batches for batch correction.

Thank you

contrast table for 4 groups

Hi,
I would like to make a contrast table for 4 groups: A, B, C, and D. But I am having trouble with the "group" column in contrast_tbl.

contrasts_oi = c("'(C-D)-(A+B)','(C-A)+(D-B)'")
contrast_tbl = tibble(contrast =  c("(C-D)-(A+B)","(C-A)+(D-B)"),  group = c("C","C2"))

According to the tutorial, the group must be chosen from A, B, C, and D. Here I set the group to "C" for "(C-D)-(A+B)", but what should I set for "(C-A)+(D-B)", given that "C2" is not one of A, B, C, or D?

Liuyang
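
The constraint described in this question (every `group` entry must be one of the actual group names) can be checked with a small base R sketch. A plain `data.frame` stands in for the `tibble` used in the post, and the check itself is an illustration rather than a function of the package.

```r
# Group names present in the data, as listed in the question
groups <- c("A", "B", "C", "D")

# Contrast table from the question, rebuilt as a base data.frame for self-containment
contrast_tbl <- data.frame(
  contrast = c("(C-D)-(A+B)", "(C-A)+(D-B)"),
  group    = c("C", "C2")
)

# Entries of `group` that are not real group names
invalid_groups <- setdiff(contrast_tbl$group, groups)
invalid_groups
```

Here the check flags "C2", which is the entry the question is asking about.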

Error polygon edge not found for ggraph_signaling_path$plot

Hi,

I ran into the following error when I was running the vignette: MultiNicheNet analysis: MIS-C threewise comparison - step-by-step with all details. This occurs when I am running the code to visualize the ‘prior knowledge’ ligand-receptor-to-target signaling paths. I am using multinichenetr 1.0.3.

ggraph_signaling_path$plot
Error in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, :
polygon edge not found

traceback()
40: grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y,
resolveHJust(x$just, x$hjust), resolveVJust(x$just, x$vjust),
x$rot, 0)
39: widthDetails.text(x)
38: widthDetails(x)
37: (function (x)
{
widthDetails(x)
})(structure(list(label = "IFNG", x = structure(list(list(1,
structure(list(list(0.52332558081627, NULL, 4L), list(0,
NULL, 3L)), class = c("unit", "unit_v2")), 201L)), class = c("unit",
"unit_v2")), y = structure(list(list(1, structure(list(list(0.455192223331275,
NULL, 4L), list(0, NULL, 3L)), class = c("unit", "unit_v2"
)), 201L)), class = c("unit", "unit_v2")), just = c(0.5, 0.5),
hjust = NULL, vjust = NULL, rot = 0, check.overlap = FALSE,
name = "text", gp = structure(list(col = "purple", fontsize = 12.8037401574803,
fontfamily = "Serif", lineheight = 1.2, font = c(bold = 2L)), class = "gpar"),
vp = NULL), class = c("text", "grob", "gDesc")))
36: grid.Call.graphics(C_setviewport, vp, TRUE)
35: push.vp.viewport(X[[i]], ...)
34: FUN(X[[i]], ...)
33: lapply(vps, push.vp, recording)
32: pushViewport(vp, recording = FALSE)
31: pushgrobvp.viewport(x$vp)
30: pushgrobvp(x$vp)
29: pushvpgp(x)
28: preDraw.grob(x)
27: preDraw(x)
26: drawGrob(x)
25: recordGraphics(drawGrob(x), list(x = x), getNamespace("grid"))
24: grid.draw.grob(x$children[[i]], recording = FALSE)
23: grid.draw(x$children[[i]], recording = FALSE)
22: drawGTree(x)
21: recordGraphics(drawGTree(x), list(x = x), getNamespace("grid"))
20: grid.draw.gTree(x$children[[i]], recording = FALSE)
19: grid.draw(x$children[[i]], recording = FALSE)
18: drawGTree(x)
17: recordGraphics(drawGTree(x), list(x = x), getNamespace("grid"))
16: grid.draw.gTree(x$children[[i]], recording = FALSE)
15: grid.draw(x$children[[i]], recording = FALSE)
14: drawGTree(x)
13: recordGraphics(drawGTree(x), list(x = x), getNamespace("grid"))
12: grid.draw.gTree(x$children[[i]], recording = FALSE)
11: grid.draw(x$children[[i]], recording = FALSE)
10: drawGTree(x)
9: recordGraphics(drawGTree(x), list(x = x), getNamespace("grid"))
8: grid.draw.gTree(x$children[[i]], recording = FALSE)
7: grid.draw(x$children[[i]], recording = FALSE)
6: drawGTree(x)
5: recordGraphics(drawGTree(x), list(x = x), getNamespace("grid"))
4: grid.draw.gTree(gtable)
3: grid.draw(gtable)
2: print.ggplot(x)
1: (function (x, ...)
UseMethod("print"))(x)

I also got the same error when I was working with my own dataset.

DE analysis issue

Hi!

Really cool pre-print! Congratulations.

I'm trying to run your software on a data set of ours - the problem I'm running into is when I get to the DE_info analysis phase, I get the following:

[1] "In case: Error in x[[1]]: subscript out of bounds: this likely means that there are not enough samples per group with sufficient cells of this cell type. This cell type will thus be ignored for further analyses, other cell types will still be considered."
perform_muscat_de_analysis errored for celltype: HSC.MPP
Here's the original error message:
subscript out of bounds
<subscriptOutOfBoundsError in x[[1]]: subscript out of bounds>
perform_muscat_de_analysis errored for celltype: HSC.MPP

This happens for all my cell types. By and large, I get enough cells/condition before cut off (with the exception of some clusters unique to a couple of the conditions). I thought the problem might be that I only have a single sample per group? I've pasted the cell type abundance/sample graph in case helpful.

Thanks for your time - apologies if being particularly daft!
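
The suspicion raised here (a single sample per group) is easy to check from the sample annotation. Below is a minimal base R sketch on hypothetical sample metadata. The threshold of two samples per group is an assumption for illustration, based on the general need for replicates in pseudobulk differential expression, and is not a documented package parameter.

```r
# Hypothetical sample-level annotation: one row per sample
sample_info <- data.frame(
  sample = c("s1", "s2", "s3"),
  group  = c("ctrl", "treatA", "treatB")
)

# Number of samples in each group
samples_per_group <- table(sample_info$group)

# Groups without replicates (fewer than 2 samples; assumed threshold)
unreplicated <- names(samples_per_group)[samples_per_group < 2]
unreplicated
```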

Sample Prioritizations are there, but not getting group prioritizations

Hello,

I love the new MultiNicheNet package and have been using it for some time now. I'm running into a challenge that I'm unsure how to fix. Everything runs smoothly up until the generate_prioritization_tables step. It appears that my prioritization table by group is empty, although my prioritization table by sample is populated.

prioritization_tables$group_prioritization_tbl %>% head(20)
# A tibble: 0 × 59
# ℹ 59 variables: contrast <chr>, group <chr>, sender <chr>, receiver <chr>,
#   ligand <chr>, receptor <chr>, lfc_ligand <dbl>, lfc_receptor <dbl>,
#   ligand_receptor_lfc_avg <dbl>, p_val_ligand <dbl>, p_adj_ligand <dbl>,
#   p_val_receptor <dbl>, p_adj_receptor <dbl>, activity <dbl>,
#   direction_regulation <fct>, activity_scaled <dbl>, lr_interaction <chr>,
#   id <chr>, avg_ligand_group <dbl>, avg_receptor_group <dbl>,
#   ligand_receptor_prod_group <dbl>, fraction_ligand_group <dbl>, …

prioritization_tables$sample_prioritization_tbl %>% head(20)
# A tibble: 20 × 26
   sample     sender     receiver ligand receptor avg_ligand avg_receptor
   <chr>      <chr>      <chr>    <chr>  <chr>         <dbl>        <dbl>
 1 LCH_Liver1 Ly6c_hi_Mo moLCH    Lyz2   Itgal          5.63         2.69
 2 WT_Lung3   Ly6c_hi_Mo moLCH    Lyz2   Itgal          5.47         2.69
 3 LCH_Liver1 MoMac      moLCH    Lyz2   Itgal          5.43         2.69
 4 LCH_Lung1  Neutro     Neutro   Il1b   Il1r2          4.99         2.84
 5 WT_Lung8   Neutro     Neutro   Il1b   Il1r2          5.21         2.69
 6 LCH_Lung9  DC1        DC1      H2.DMa Cd74           2.60         5.34
 7 WT_Lung8   Ly6c_hi_Mo moLCH    Lyz2   Itgal          5.39         2.58
 8 WT_Liver3  Ly6c_hi_Mo moLCH    Lyz2   Itgal          5.66         2.37
 9 WT_Liver8  Ly6c_hi_Mo moLCH    Lyz2   Itgal          5.55         2.40
10 WT_Lung3   MoMac      moLCH    Lyz2   Itgal          4.91         2.69
11 WT_Liver3  MoMac      moLCH    Lyz2   Itgal          5.57         2.37
12 WT_Lung8   Neutro     Mast     S100a8 Cd69           5.78         2.27
13 WT_Lung8   DC1        DC1      H2.DMa Cd74           2.54         5.16
14 WT_Lung8   MoMac      moLCH    Lyz2   Itgal          5.05         2.58
15 LCH_Lung1  Ly6c_hi_Mo moLCH    Lyz2   Itgal          5.30         2.45
16 LCH_Liver1 RTM        moLCH    Lyz2   Itgal          4.80         2.69
17 LCH_Lung9  RTM        MoMac    Apoe   Trem2          5.63         2.28
18 LCH_Lung1  MoMac      moLCH    Lyz2   Itgal          5.23         2.45
19 LCH_Liver9 Ly6c_hi_Mo moLCH    Lyz2   Itgal          5.16         2.47
20 LCH_Lung9  DC1        DC2      H2.DMa Cd74           2.60         4.89
# ℹ 19 more variables: ligand_receptor_prod <dbl>, fraction_ligand <dbl>,
#   fraction_receptor <dbl>, ligand_receptor_fraction_prod <dbl>,
#   pb_ligand <dbl>, pb_receptor <dbl>, ligand_receptor_pb_prod <dbl>,
#   group <chr>, prioritization_score <dbl>, lr_interaction <chr>, id <chr>,
#   scaled_LR_prod <dbl>, scaled_LR_frac <dbl>, scaled_LR_pb_prod <dbl>,
#   n_cells_receiver <dbl>, keep_receiver <dbl>, n_cells_sender <dbl>,
#   keep_sender <dbl>, keep_sender_receiver <fct>

I was looking back at different objects and the grouping table seems fine, but I can't tell if anything else is amiss. Do you have any suggestions? Is there a threshold that I might need to play with?
