cdsrplots contains standard CDS plots and themes.
library(devtools)
devtools::install_github("broadinstitute/cdsr_plots")
The package can then be loaded by calling
library(cdsrplots)
- Plotting
  - make_volcano
  - make_gsea_dot
  - make_gsea_bar
- Theme
  - theme_publication
  - scale_fill_publication
  - scale_color_publication
The make_volcano function is a quick and easy tool to visualize
results from differential expression/dependency analyses and can be
easily customized.
As an example, we will load the results of a differential expression analysis comparing Nutlin-treated cells to DMSO-treated cells.
library(readr)     # provides read_csv()
library(magrittr)  # provides the %>% pipe
nutlin <- read_csv("./nutlin.csv")
nutlin %>% head()
## # A tibble: 6 x 3
## gene logFC p_value
## <chr> <dbl> <dbl>
## 1 CDKN1A 1.31 2.82e-166
## 2 MDM2 0.994 5.12e-155
## 3 GDF15 0.977 3.33e-146
## 4 SUGCT 0.708 1.03e-139
## 5 RPS27 0.366 3.80e- 97
## 6 FDXR 0.679 1.04e- 96
Make a simple plot by providing the data frame and variable names for effect size and significance.
cdsrplots::make_volcano(nutlin, 'logFC', 'p_value')
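For orientation, a volcano plot conventionally places the effect size on the x-axis and -log10 of the p-value on the y-axis. The sketch below computes those coordinates in base R for the six rows printed above. It assumes make_volcano follows this convention, which this README does not state, and nutlin_head is a name introduced only for illustration.

```r
# Six rows copied from the head() output above (not the full data set).
nutlin_head <- data.frame(
  gene    = c("CDKN1A", "MDM2", "GDF15", "SUGCT", "RPS27", "FDXR"),
  logFC   = c(1.31, 0.994, 0.977, 0.708, 0.366, 0.679),
  p_value = c(2.82e-166, 5.12e-155, 3.33e-146, 1.03e-139, 3.80e-97, 1.04e-96)
)

# Conventional volcano coordinates: effect size vs. -log10(p-value).
# Assumption: make_volcano applies an equivalent transformation internally.
nutlin_head$neg_log10_p <- -log10(nutlin_head$p_value)

print(nutlin_head[, c("gene", "logFC", "neg_log10_p")])
```

Smaller p-values give larger y-values, which is why the most significant genes sit at the top of the plot.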
Provide label_var, a column containing labels for the data points.
- By default rank_by = ‘effect’ and n_labeled = 10, meaning that the 10 left-/right-most points will be labeled, ranked by effect size.
- If rank_by = ‘pval’, the n_labeled most significant points will be labeled.
- Otherwise, the user may specify custom points to label with the logical vector label_bool.
cdsrplots::make_volcano(nutlin, 'logFC', 'p_value', label_var = 'gene')
nutlin %>% dplyr::mutate(top_5_sig = rank(p_value) <= 5) %>%
  cdsrplots::make_volcano('logFC', 'p_value', label_var = 'gene',
                          label_bool = 'top_5_sig', ggrepel_type = 'label')
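Any logical column can serve as the custom label selector. As a minimal sketch, the base R code below flags a hand-picked set of genes in the six rows printed earlier. The names nutlin_head, genes_of_interest and is_of_interest are hypothetical and chosen only for this illustration; the commented-out call mirrors the usage shown above.

```r
# Six rows copied from the head() output earlier in this README.
nutlin_head <- data.frame(
  gene    = c("CDKN1A", "MDM2", "GDF15", "SUGCT", "RPS27", "FDXR"),
  logFC   = c(1.31, 0.994, 0.977, 0.708, 0.366, 0.679),
  p_value = c(2.82e-166, 5.12e-155, 3.33e-146, 1.03e-139, 3.80e-97, 1.04e-96)
)

# Hypothetical list of genes to label.
genes_of_interest <- c("CDKN1A", "MDM2")

# TRUE for rows that should be labeled, FALSE otherwise.
nutlin_head$is_of_interest <- nutlin_head$gene %in% genes_of_interest

# The column name could then be passed as label_bool, for example:
# cdsrplots::make_volcano(nutlin_head, 'logFC', 'p_value',
#                         label_var = 'gene', label_bool = 'is_of_interest')
print(nutlin_head[, c("gene", "is_of_interest")])
```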
Provide color_var, a categorical column for coloring purposes.
- By default, a grey background/red highlight dual color scheme is used for logical vectors. For all other vector classes, colors are assigned to categories arbitrarily.
- Otherwise, the user may specify custom colors to use with color_values.
nutlin %>% dplyr::mutate(fdr = p.adjust(p_value, method = 'fdr')) %>%
cdsrplots::make_volcano('logFC', 'p_value', q_var = 'fdr')
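The fdr column in the call above comes from base R's p.adjust, where method = 'fdr' is an alias for the Benjamini-Hochberg procedure ('BH'). The sketch below runs it on the six p-values printed earlier. Adjusting only six values gives different numbers than adjusting the full table, so this illustrates the call rather than reproducing the real q-values.

```r
# The six p-values from the head() output earlier in this README.
p_value <- c(2.82e-166, 5.12e-155, 3.33e-146, 1.03e-139, 3.80e-97, 1.04e-96)

# Benjamini-Hochberg adjustment; 'fdr' and 'BH' name the same method.
fdr <- p.adjust(p_value, method = "fdr")
stopifnot(isTRUE(all.equal(fdr, p.adjust(p_value, method = "BH"))))

# Adjusted values are never smaller than the raw p-values.
stopifnot(all(fdr >= p_value))

print(fdr)
```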
location_colors <- c('left' = '#D95F02', 'right' = '#7570B3', 'bottom' = '#333333')
nutlin %>%
dplyr::mutate(location = ifelse(logFC < 0, 'left', 'right')) %>%
dplyr::mutate(location = ifelse(p_value > 1e-10, 'bottom', location)) %>%
cdsrplots::make_volcano('logFC', 'p_value', color_var = 'location', color_values = location_colors)
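Before passing a named color vector, it can help to confirm that every category in the color column has a matching color. This is a minimal base R sketch, assuming that the names of color_values are matched against the category values as the example above suggests. The toy data frame holds made-up values for illustration only.

```r
# Made-up values for illustration only (not taken from nutlin.csv).
toy <- data.frame(
  logFC   = c(-1.2, 0.8, 0.1, -0.05),
  p_value = c(1e-30, 1e-20, 0.2, 1e-3)
)

# Same two-step categorisation as in the pipeline above.
toy$location <- ifelse(toy$logFC < 0, "left", "right")
toy$location <- ifelse(toy$p_value > 1e-10, "bottom", toy$location)

location_colors <- c(left = "#D95F02", right = "#7570B3", bottom = "#333333")

# Categories present in the data that have no entry in the color vector.
missing_colors <- setdiff(unique(toy$location), names(location_colors))
stopifnot(length(missing_colors) == 0)

print(table(toy$location))
```

Note that the second ifelse overrides the first, so a point with a weak p-value is always 'bottom' regardless of the sign of its effect.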
Since make_volcano returns a ggplot object, you can make more tweaks using the ggplot syntax.
cdsrplots::make_volcano(nutlin, 'logFC', 'p_value') + cdsrplots::theme_publication()
volcano <- nutlin %>% dplyr::mutate(top_5_sig = rank(p_value) <= 5) %>%
  cdsrplots::make_volcano('logFC', 'p_value', label_var = 'gene', color_var = 'top_5_sig',
                          label_bool = 'top_5_sig', ggrepel_type = 'label')
# aes() and scale_size_manual() come from ggplot2
volcano + ggplot2::aes(size = top_5_sig) + ggplot2::scale_size_manual(values = c(1, 3))
The make_gsea_dot and make_gsea_bar functions are a quick and easy
way to visualize the results of a gene set enrichment analysis. The
functions are designed to work with the cdsrgsea package, but they
are flexible enough to work with other GSEA packages.
As an example we will look for enriched gene sets from the HALLMARK collection using the hypergeometric test.
gene_sets <- cdsrgsea::load_gene_sets()
nutlin_gsea <- cdsrgsea::run_hyper(nutlin, gene_sets$Hallmark, gene_var = "gene", rank_var = "logFC")
nutlin_gsea %>% head()
## # A tibble: 6 x 8
## term p_value p_adjust odds_ratio direction size overlap_size overlap
## <chr> <dbl> <dbl> <dbl> <chr> <int> <int> <list>
## 1 HALLMARK_P5… 5.52e-21 2.76e-19 25.8 pos 200 24 <chr […
## 2 HALLMARK_E2… 6.37e-20 3.19e-18 19.8 neg 200 25 <chr […
## 3 HALLMARK_G2… 7.38e-14 1.85e-12 13.2 neg 200 20 <chr […
## 4 HALLMARK_MY… 8.95e- 8 1.49e- 6 7.61 neg 200 14 <chr […
## 5 HALLMARK_AP… 3.33e- 5 8.32e- 4 6.70 pos 161 9 <chr […
## 6 HALLMARK_TN… 1.81e- 4 3.02e- 3 5.28 pos 200 9 <chr […
Make a simple bar plot by providing the data frame returned by cdsrgsea.
cdsrplots::make_gsea_bar(nutlin_gsea)
The default variable names match the variable names returned by cdsrgsea.
- enrich_var specifies the column containing the enrichment values. By default enrich_var = ‘odds_ratio’ for hypergeometric and ‘NES’ for GSEA.
- size_var specifies the column containing the sizes. By default size_var = ‘overlap_size’ for hypergeometric and ‘size’ for GSEA.
- p_var specifies the column containing the significance values. By default p_var = ‘p_value’.
Variable names can be changed to work with other GSEA packages or to
customize the plots. For example, we can set p_var to ‘p_adjust’
instead of ‘p_value’.
cdsrplots::make_gsea_dot(nutlin_gsea,p_var = 'p_adjust')
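When results come from another GSEA tool, an alternative to overriding each argument is to rename the columns to the names shown in the cdsrgsea output above. The base R sketch below does this for a made-up result table with fgsea-style column names; the values and set names are illustrative only. Whether the plotting functions need further columns, such as direction, is not covered here and should be checked against the package documentation.

```r
# Made-up result table with fgsea-style column names; values are illustrative only.
other_res <- data.frame(
  pathway = c("SET_A", "SET_B"),
  pval    = c(0.001, 0.2),
  padj    = c(0.002, 0.2),
  NES     = c(2.1, -0.8),
  size    = c(150L, 80L)
)

# Old name -> new name, following the column names in the cdsrgsea output above.
rename_map <- c(pathway = "term", pval = "p_value", padj = "p_adjust")

# Locate each old column and overwrite its name with the new one.
idx <- match(names(rename_map), names(other_res))
names(other_res)[idx] <- unname(rename_map)

print(names(other_res))
```

The NES and size columns are left as they are, since the text above gives ‘NES’ and ‘size’ as the GSEA defaults for enrich_var and size_var.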
The direction parameter dir can be set to ‘pos’ to show only positive
terms or ‘neg’ to show only negative terms.
cdsrplots::make_gsea_dot(nutlin_gsea,dir = "pos")
The color_by argument sets how the plot is colored. There are three
options:
- ‘pval’ - colors by significance
- ‘dir’ - colors by the direction of the enrichment
- ‘enrich’ - colors by enrichment
cdsrplots::make_gsea_bar(nutlin_gsea, color_by = "dir")
The x_by argument sets which variable is plotted on the x-axis. There
are two options:
- ‘pval’ - the significance is plotted on the x-axis
- ‘enrich’ - the enrichment is plotted on the x-axis
cdsrplots::make_gsea_bar(nutlin_gsea, color_by = "enrich", x_by = "pval")
There are a number of parameters that modify the y-axis:
- n_shown sets the number of terms which are shown.
- sig_only determines whether only significant terms are shown.
cdsrplots::make_gsea_dot(nutlin_gsea, sig_only = TRUE)