kasperskytte / ampvis2

Tools for visualising microbial community amplicon data

Home Page: https://kasperskytte.github.io/ampvis2/

License: GNU General Public License v3.0

R 99.61% Dockerfile 0.39%

amplicon-sequencing microbial-ecology r-package rstats

ampvis2's Introduction

Tools for visualising amplicon data

ampvis2 is an R-package to conveniently visualise and analyse 16S rRNA amplicon data in different ways.

Installing ampvis2

First, install R (3.5.x or later) and RStudio. Windows users should also install RTools. Then open RStudio as administrator (!) and run the commands below to install ampvis2 from the console:

install.packages("remotes")
remotes::install_github("kasperskytte/ampvis2")

Tip: For faster installation you can use multiple CPU cores by setting the Ncpus argument, e.g. remotes::install_github("kasperskytte/ampvis2", Ncpus = 6). Most CPUs today can run 8 processes simultaneously, so 6 is a good starting point unless you know your CPU has more than 8 (logical) cores.
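
A minimal sketch of deriving the Ncpus value from the machine instead of hard-coding it, using the base parallel package that ships with R. The variable names and the choice to leave two cores free are illustrative assumptions, not something the ampvis2 documentation prescribes.

```r
# Detect logical cores; detectCores() can return NA on some platforms
n_logical <- parallel::detectCores(logical = TRUE)
if (is.na(n_logical)) n_logical <- 1L

# Leave two cores free for the rest of the system, but never go below 1
n_cpus <- max(1L, n_logical - 2L)

# Not run here: pass the value on to the installer
# remotes::install_github("kasperskytte/ampvis2", Ncpus = n_cpus)
```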

Get started

For a quick guide on how to use ampvis2 go to the Get Started page. Detailed documentation of all ampvis2 functions can be found at the Functions page.

RStudio Docker container

A Docker container based on the rocker/rstudio image is also provided with ampvis2 preinstalled. This is ideal for complete reproducibility and portability. All you need to do is install Docker and then run:

docker run -d \
  -e "PASSWORD=supersafepassword" \
  -v "local/folder/to/mount":/home/rstudio \
  -p 8787:8787 \
  ghcr.io/kasperskytte/ampvis2:main

Access RStudio Server through a browser at http://localhost:8787 with the username rstudio. Ideally use a specific version tag, e.g. v2.7.31, instead of main, so that the same image is used every time rather than whatever is latest.

Blog posts about ampvis2

Check out the blog posts at http://albertsenlab.org/ about selected ampvis2 plotting functions. The posts include details as well as example code.

Shiny app

An interactive Shiny app with some of the basic functionality of ampvis2 can be found at: https://kasperskytte.shinyapps.io/shinyampvis

ampvis2's Issues

Ordination plots without samples.

I'm having trouble getting an ordination plot without samples. All I want is the OTU ordination plot as in phyloseq.

amp_timeseries default y-axis to ylim(0,NA)

I noticed that the default settings in amp_timeseries simply scale the y-axis between the min and max of the data. In most cases it would be better to start the y-axis at 0. Could the default be ylim(0, NA), or something similar?

"observed" in alphadiv not showing up

Add a column with the number of unique OTUs per sample

load tree file by amp_load function

Dear ampvis2 guys,
Thank you so much for developing the wonderful R package.
My question is how to load a tree file with the amp_load function.
I can see the code for loading the otutable, metadata, and fasta at the link (https://madsalbertsen.github.io/ampvis2/reference/amp_load.html), but none for a tree file.

Could you please provide code for loading a tree file with amp_load? Thanks a lot.

amp_subset_taxa to have an option to remove singletons

It would be nice if amp_subset_taxa would support removing OTUs with only 1 read across all samples. e.g. amp_subset_taxa(data,remove_singletons=T)
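
The requested behaviour can be sketched in base R on a plain abundance matrix. This is only an illustration of the filtering rule on hypothetical toy data; the remove_singletons argument named above is the requester's proposal, and nothing here reflects how ampvis2 implements it.

```r
# Toy OTU table: rows are OTUs, columns are samples (hypothetical data)
abund <- matrix(
  c(0, 1, 0,
    5, 2, 0,
    0, 0, 3),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("OTU1", "OTU2", "OTU3"), c("S1", "S2", "S3"))
)

# A singleton has exactly one read summed over all samples
is_singleton <- rowSums(abund) == 1

# drop = FALSE keeps the matrix shape even if only one OTU remains
abund_filtered <- abund[!is_singleton, , drop = FALSE]
```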

Bug in amp_subset_samples()

I think a bug may have been introduced in amp_subset_samples().

I cannot get it to recognize the metadata colnames. This is the same for my data and with the MiDAS dataset in the package.

data("MiDAS")
MiDASsubset <- amp_subset_samples(MiDAS, Plant %in% c("Aalborg West", "Aalborg East"))

Error in Plant %in% c("Aalborg West", "Aalborg East"): object 'Plant' not found
Traceback:

amp_subset_samples(MiDAS, Plant %in% c("Aalborg West", "Aalborg East"))

Plant %in% c("Aalborg West", "Aalborg East")

It is the correct variable name.

MiDAS$metadata %>% colnames()

'SampleID'
'Plant'
'Date'
'Year'
'Period'

packageVersion("ampvis2")

[1] ‘2.3.18’

version

platform x86_64-conda_cos6-linux-gnu
arch x86_64
os linux-gnu
system x86_64, linux-gnu
status
major 3
minor 5.0
year 2018
month 04
day 23
svn rev 74626
language R
version.string R version 3.5.0 (2018-04-23)
nickname Joy in Playing

Add colored side bar to heatmap

Hi,

I am trying to add a side bar to the columns in heatmap to denote different categories of samples. Something similar to the picture below (done with pheatmap). Is it possible to do it in ampvis2?

Thanks in advance!

amp_heatmap (probably also others): error when metadata contains a column named "Sample"

Error when a column in the metadata is called "Sample":

library(ampvis2)
#> Loading required package: ggplot2
d <- amp_load(otutable = read.delim("~/Downloads/otutable_B.txt", check.names = F, header = T, stringsAsFactors = F),
              metadata = read.delim("~/Downloads/metadata_B.txt", check.names = F, header = T, stringsAsFactors = F))
#> Warning: Only 27 of 33 unique sample names match between metadata and otutable. The following unmatched samples have been removed:
#> otutable (6): 
#>  "MQ171129-280", "MQ171129-265", "MQ171129-275", "MQ171129-284", "MQ171129-273", "MQ171129-269"
amp_heatmap(d, group_by = "Sample", facet_by = "Day")
#> Error in `[.data.frame`(heatmap$data, , ogroup): undefined columns selected

Created on 2018-10-23 by the reprex package (v0.2.1)

Traceback:

8: stop("undefined columns selected")
7: `[.data.frame`(heatmap$data, , ogroup) at amp_heatmap.R#478
6: heatmap$data[, ogroup] at amp_heatmap.R#478
5: amp_heatmap(d, group_by = "Sample", facet_by = "Day") at .active-rstudio-document#4
4: eval(ei, envir)
3: eval(ei, envir)
2: withVisible(eval(ei, envir))
1: source("~/.active-rstudio-document", echo = TRUE)

This happens because a column named "Sample" is temporarily created to generate the plot.

RDA and CCA plots with 2 values on x and y axis

Hi there! I plotted my sequencing data and metadata using ampvis in RStudio and I got 2 values on both x and y axis for my RDA and CCA graphs, and I'm not really sure how to interpret them.
I used the command amp_ordinate to prepare the graph.
RDA graph was: x axis = [10%/60/2%] and y axis = [6.2%/ 37.3%]
CCA graph was: x axis = [ 8.4% / 67.1] and y axis = [3.6% / 28.9%]

Can someone help clarifying please?
Thank you!!

amp_subset_samples does not work with single sample selection

The taxonomy is missing if I run amp_subset_samples with a selection that matches only one sample, e.g.
amp_subset_samples(d, SeqID %in% c("16SAMP-17772"))

However, it works fine when I run with two or more samples e.g.
amp_subset_samples(d, SeqID %in% c("16SAMP-17772","16SAMP-17771"))
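
One common cause of this kind of one-versus-many difference in R is that subsetting a single column drops the result to a bare vector. Whether that is the cause here is an assumption; the sketch below only demonstrates the general pitfall on a hypothetical toy table.

```r
# Toy abundance table with OTUs as row names (hypothetical data)
abund <- data.frame(
  "16SAMP-17771" = c(10, 0, 3),
  "16SAMP-17772" = c(4, 7, 0),
  row.names = c("OTU1", "OTU2", "OTU3"),
  check.names = FALSE
)

# Selecting one column the default way drops to a bare numeric vector,
# so the OTU row names (and anything keyed on them) are gone
one_dropped <- abund[, "16SAMP-17772"]

# drop = FALSE keeps a one-column data frame with its row names
one_kept <- abund[, "16SAMP-17772", drop = FALSE]
```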

Support trees+UniFrac distances

For amp_ordinate

Trouble installing ampvis2

I used ampvis2 recently, but got a new computer and needed to reinstall all my old R packages.

When trying to install ampvis2 by your instructions online, I got this:

remotes::install_github("MadsAlbertsen/ampvis2")
Error in curl::curl_fetch_memory(url, handle = h) :
Timeout was reached: Connection timed out after 10000 milliseconds

Tried installing older versions of ampvis2 with remotes::install_github("madsalbertsen/[email protected]") but got the same error.

Ideas? R version 3.5.2

Thank you!

amp_heatmap facet_by doesn't work without group_by

Rarefy argument in amp_subset_samples()

Sometimes it's nice to be able to rarefy to the same number of reads in all samples. Should be included as an option in amp_subset_samples().
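
What rarefying to a common depth means can be sketched in base R for a single sample. The function name rarefy_counts and the toy counts are hypothetical; this illustrates subsampling without replacement and is not the ampvis2 implementation (the vegan package offers rrarefy() for the same purpose).

```r
# Subsample one sample's counts to a fixed depth without replacement
rarefy_counts <- function(counts, depth) {
  stopifnot(sum(counts) >= depth)
  # Expand counts into one entry per read, labelled by OTU index
  reads <- rep(seq_along(counts), times = counts)
  # Draw positions rather than values to avoid sample()'s length-one quirk
  picked <- reads[sample.int(length(reads), size = depth)]
  # Count how many of the drawn reads fall in each OTU
  out <- tabulate(picked, nbins = length(counts))
  names(out) <- names(counts)
  out
}

set.seed(42)
sample_counts <- c(OTU1 = 50, OTU2 = 30, OTU3 = 20)
rarefied <- rarefy_counts(sample_counts, depth = 40)
```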

amp_export_otutable() adds csv file ends

Even though the default sep is "\t", the file ending is ".csv". Maybe change it to the general ".txt" or omit it for the user to write the entire filename.

Best Rasmus

Error using RStudio for ordination in shiny package

All features in the ampvis2 Shiny app work great from my RStudio environment, with the exception of ordination. Could you point me in the right direction regarding the error message I get when I try to use ordination: Error in %in%: cannot open file '~/R/x86_64-pc-linux-gnu-library/3.4/plotly/R/sysdata.rdb': No such file or directory

shiny::runGitHub("amplicon-visualiser","Kasperskytte")
Error in %in%: cannot open file '~/R/x86_64-pc-linux-gnu-library/3.4/plotly/R/sysdata.rdb': No such file or directory
90: %in%
89: FUN
88: vapply
87: has_attr
86: FUN
85: lapply
84: supply_defaults
83: plotly_build.plotly
81: createPayload
80: origRenderFunc
79: output$ord
3: runApp
2: runUrl
1: shiny::runGitHub

ordinate colorframe labels

sample_colorframe_label is not currently affected when a metadata variable is set with sample_colorframe.

amp_export_otutable raw = F removes taxonomy in output

amp_timeseries, group_by doesn't work

Shiny doesn't work for plotting: An error has occurred. Check your logs or contact the app author for clarification.

Hi,

I am trying to use the ampvis2 Shiny app but it doesn't work. The data loads correctly, but when I try to plot by clicking on the Analysis tab and then heatmap/boxplot, it gives an error. I have arranged all the data as in the provided examples and get the same error. I even tried the sample data and got the same error. Can you comment on it?

Thanks!
Arslan

shiny app

Hi,
I would like to know if there is a possibility to share your shiny application.

Best regards,

Mohamed

Add function to filter based on abundance

The function would support the "raw" workflow, where we use e.g. amp_subset_taxa() and amp_normalise().
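
A base R sketch of one possible filtering rule: keep OTUs that reach a relative abundance threshold in at least one sample, and return the raw counts for the kept rows. The threshold, the variable names, and the "at least one sample" rule are assumptions for illustration only.

```r
# Toy OTU table: rows are OTUs, columns are samples (hypothetical data)
abund <- matrix(
  c(900, 95, 5,
    800, 199, 1),
  nrow = 3,
  dimnames = list(c("OTU1", "OTU2", "OTU3"), c("S1", "S2"))
)
threshold_pct <- 1

# Relative abundance in percent, computed per sample (column)
rel_abund <- sweep(abund, 2, colSums(abund), "/") * 100

# Keep OTUs whose maximum relative abundance reaches the threshold
keep <- apply(rel_abund, 1, max) >= threshold_pct

# Return raw counts so later steps can still normalise themselves
abund_filtered <- abund[keep, , drop = FALSE]
```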

amp_rarecurve error when N reads < stepsize

Add an error message or print which samples were excluded from the analysis.
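
A minimal sketch of the requested check in base R, assuming samples are columns of a count matrix. The data and variable names are hypothetical, and the message wording is only a suggestion.

```r
# Toy OTU table: rows are OTUs, columns are samples (hypothetical data)
abund <- matrix(
  c(120, 3, 400,
     80, 1, 250),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("OTU1", "OTU2"), c("S1", "S2", "S3"))
)
stepsize <- 100

# Total reads per sample, then the samples that cannot fill one step
reads_per_sample <- colSums(abund)
too_small <- names(reads_per_sample)[reads_per_sample < stepsize]

if (length(too_small) > 0) {
  message("Samples with fewer reads than stepsize (", stepsize, "): ",
          paste(too_small, collapse = ", "))
}
```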

amp_heatmap tax_show does not handle vector input correctly

amp_heatmap returns an error when tax_show is a character vector.

amp_subset_samples prints message that is shown in rmarkdown

amp_subset_samples prints message with subsampling statistics. Useful for analysis but annoying when reporting in Rmarkdown, where extra effort has to be made to remove it. Can message be printed with warning/message status or similar, so it can be turned off with message=FALSE or warning=FALSE in Rmarkdown.
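
The difference can be sketched with a hypothetical helper: output sent through message() is a condition that callers can silence, whereas print() or cat() output cannot be suppressed the same way. The function name and message text below are made up for illustration.

```r
# Hypothetical helper that reports how many samples were removed
report_removed <- function(n_removed) {
  # message() signals a condition and writes to stderr, unlike print()/cat()
  message(n_removed, " samples have been filtered.")
  invisible(n_removed)
}

# Callers can silence it without touching the function
result <- suppressMessages(report_removed(3))
```

In an R Markdown chunk, the same output would be hidden by the message=FALSE chunk option.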

constrained PCA/RDA

I went for using constrained PCA (RDA) the other day, but it did not succeed that well.
I wanted to ask if you can make an example using this function. I want the constrained parameter to be present at the plot and 'colorgradient' similar in several of the examples in ampvis.

Also, is there a way to add a spell-check to the code, so that if a command is misspelled, a warning is given and the code stops running?

In addition to this, I have experienced that some functions did not run using the XXX_XXX, but only XXX.XXX (using dot instead of underscores). But as far as I can see the functions in ampvis2 are using _.

Import QIIME+MOTHUR+BIOM formats

amp_heatmap: x-axis variables should be made to strings

When using e.g. year on the x-axis, it is treated as numeric. Convert all x-axis variables (group_by) to strings instead.

Different most abundant taxa in amp_heatmap and amp_boxplot using tax_show=N

I am not sure whether this is a bug but I get different sets of taxa using amp_heatmap and amp_boxplot.
The parameter tax_show is set to 10 in both cases and the functions are called in the same manner (see below). As I understood from the provided examples the top N most abundant taxa should be plotted if tax_show is not set to "all". If this is correct it would be good to know what affects the selection of the top N taxa in both functions.

Thank you in advance!

Used version: ampvis2_2.3.2

Code snippet:

ampvis2_obj_no_contam <- ...
p_ampvis_rank         <- "Phylum"
p_ampvis_tax_add      <- NULL
p_ampvis_tax_show     <- 10

amp_heatmap(
  data=ampvis2_obj_no_contam,
  group_by=sample_group,
  tax_aggregate=p_ampvis_rank,
  tax_add=p_ampvis_tax_add,
  tax_empty="OTU",
  tax_show=p_ampvis_tax_show
)

amp_boxplot(
  data=ampvis2_obj_no_contam,
  group_by=sample_group,
  tax_aggregate=p_ampvis_rank,
  tax_add=p_ampvis_tax_add,
  tax_empty="OTU",
  tax_show=p_ampvis_tax_show
)

Screenshots:

amp_heatmap shows wrong numbers with many 0's

Noticed by Fredrik Bak (GEUS): the mean of triplicate samples was not calculated correctly by dividing by 3, but by something more like 2.5.

Axis title when using amp_boxplot with "normalise = FALSE"

Hi,

When I use amp_boxplot with "normalise = FALSE", the x-axis is the reads number instead of the relative abundance. However, the x-axis title is still "Read Abundance (%)". Could you help to fix it?

Thank you!

Export otutable with one sample bug

amp_export_otutable returns an error if attempting to export otutable with one sample.
textmap_test_otutable.txt

# Test code
library(ampvis2)
library(tidyverse)

otutable <- read_delim("textmap_test_otutable.txt", delim = ",")
metadata <- data.frame(SeqID = "MQ170823-258", Var1 = "test")

d <- amp_load(otutable, metadata)

amp_heatmap(d)

amp_export_otutable(d, "test_otutable.txt")

sort_by and order_x/y_by in amp_heatmap not working correctly

Installation info needed in description

library("devtools")
install_github("MadsAlbertsen/ampvis2")

Replace aes_string() uses when ggplot2 3.0 goes live on CRAN

Multiple envfit arrows in one plot.

Hi Kasper,
I was wondering if it's possible to fit multiple envfit arrows in one plot, as can be done in vegan.

Thanks,

Cris.

Heatmap tax_empty="best"

Hi
It would be great if amp_heatmap with tax_empty="best" did not write identical tax strings twice when there is no taxonomy for e.g. both Phylum and Genus.

CCA analyses

Hi, I am trying to run a CCA but I get the following error, Error in cca.default(inputmatrix, ...) :
all row sums must be >0 in the community data matrix.

amp_ordinate(d,
             type = "CCA",
             transform = "Hellinger",
             sample_color_by = "Cellar_location", sample_plotly = "all", filter_species = 10, species_plot = TRUE, species_nlabels = 15, species_shape = 20,
species_rescale = TRUE)

I have previously filtered the data using amp_subset_samples, which as indicated automatically removes absent OTUs(?). However, I still get this error.

Thanks for your help.
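
The error message says that every row of the community matrix must have a positive sum. One possible cause, and it is only an assumption here, is that filtering left some samples without any reads. A base R sketch for spotting and dropping such samples on hypothetical toy data:

```r
# Toy community matrix in the orientation vegan uses: samples as rows
comm <- matrix(
  c(12, 0, 5,
     0, 0, 0,
     3, 9, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("S1", "S2", "S3"), c("OTU1", "OTU2", "OTU3"))
)

# Samples whose counts sum to zero are the ones the error refers to
empty_samples <- rownames(comm)[rowSums(comm) == 0]

# Keep only samples with at least one read
comm_nonempty <- comm[rowSums(comm) > 0, , drop = FALSE]
```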

amp_load dummy data bug

The amp_load object structure is wrong if no metadata is provided.


# Test code
library(ampvis2)
library(tidyverse)

otutable <- read_delim("textmap_test_otutable.txt", delim = ",")

d <- amp_load(otutable)

amp_heatmap(d)

textmap_test_otutable.txt

amp_alphadiv reports rarefied read counts. Should be raw.

amp_alphadiv reports rarefied read counts. I think it would make sense to still include the raw number of reads, maybe as "RawReads", or simply report reads as "RawReads". As far as I remember, this is also how it used to behave.

species_nlabels

Hi
I am a bit unsure what the term 'extreme' refers to in the following.

species_nlabels Number of the most extreme species labels to plot (Only makes sense with PCA/RDA).

Does it mean the most abundant, or does it cover something else?
How are the 'most extreme' species determined?

Many thanks in advance.

amp_venn issue

Hi, I am trying to run amp_venn on a dataset but I get the following error,

amp_venn(d, cut_a = 10, group_by = "Winery")
Error in amp_venn(d, cut_a = 10, group_by = "Winery") :
object 'p' not found

Any ideas?

Thanks,

Cris

amp_heatmap textmap bug

Rows are missing when using the amp_heatmap textmap option.

# Test code
library(ampvis2)
library(tidyverse)

otutable <- read_delim("textmap_test_otutable.txt", delim = ",")
metadata <- data.frame(SeqID = "MQ170823-258", Var1 = "test")

d <- amp_load(otutable, metadata)

amp_heatmap(d)

taxanames1 <- amp_heatmap(d,
                          tax_aggregate = "Genus", 
                          tax_show = 50,
                          tax_add = "Phylum", 
                          tax_class = "p__Proteobacteria",
                          tax_empty = "OTU") %>%
  .[["data"]] %>%
  select(Display) %>%
  unlist() %>%
  as.character()


taxanames2 <- amp_heatmap(d,
                          tax_aggregate = "Genus", 
                          tax_show = 50,
                          tax_add = "Phylum", 
                          tax_class = "p__Proteobacteria",
                          tax_empty = "OTU",
                          textmap = T) %>%
  row.names()

taxanames1[!(taxanames1 %in% taxanames2)]

textmap_test_otutable.txt

error when loading phyloseq object with ASVs

Hi,

I am trying to import a phyloseq object of ASVs made with DADA2.
I have renamed the ASVs ASV1, ASV2....

> nq1F515R806psF
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 1890 taxa and 28 samples ]
sample_data() Sample Data:       [ 28 samples by 4 sample variables ]
tax_table()   Taxonomy Table:    [ 1890 taxa by 7 taxonomic ranks ]

I used your recipe for importing phyloseq objects, but get an error.
Here are my commands:

> obj <- nq1F515R806psF
> otutable <- data.frame(
    OTU = rownames(phyloseq::otu_table(obj)@.Data),
    phyloseq::otu_table(obj)@.Data,
    phyloseq::tax_table(obj)@.Data,
    check.names = FALSE
  )
Error in data.frame(OTU = rownames(phyloseq::otu_table(obj)@.Data), phyloseq::otu_table(obj)@.Data,  : 
  arguments imply differing number of rows: 28, 1890

Could you help me figure out what I am doing wrong?

Thanks,
Camilla
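
The row counts in the error (28 versus 1890) match the 28 samples and 1890 taxa, which suggests the OTU table is stored with samples as rows. Under that assumption, transposing the count matrix before building the data frame resolves the mismatch. The sketch below uses small base R matrices as stand-ins for the phyloseq slots; with a real object the orientation can be checked with phyloseq::taxa_are_rows(obj).

```r
# Toy stand-in for otu_table(obj)@.Data stored with samples as rows
counts <- matrix(
  1:6, nrow = 2,
  dimnames = list(c("S1", "S2"), c("ASV1", "ASV2", "ASV3"))
)
# Toy stand-in for tax_table(obj)@.Data, one row per taxon
tax <- matrix(
  c("Bacteria", "Bacteria", "Archaea"), ncol = 1,
  dimnames = list(c("ASV1", "ASV2", "ASV3"), "Kingdom")
)

# If the row counts disagree, the counts are samples x taxa: transpose
if (nrow(counts) != nrow(tax)) counts <- t(counts)

otutable <- data.frame(
  OTU = rownames(counts),
  counts,
  tax,
  check.names = FALSE
)
```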

Bray-curtis similarity plot

I showed you that nice Bray-Curtis similarity plot. It would be nice to have as an extra feature in amp_ordinate.

amp_alphadiv scrambles the reported data

Currently data$metadata and data$abund are not sorted in the same order. Hence, the following 2 lines of code scramble the number of reads reported per sample (and the other measures, I guess).

results <- data$metadata
results$Reads <- colSums(data$abund)

If possible, then always use merge/join instead. It's too risky to assume that order is preserved across data frames.
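
A base R sketch of the name-based alternative, using hypothetical toy data in which the abundance columns are deliberately in a different order than the metadata rows. The column name SampleID is an assumption for the example.

```r
# Metadata rows are ordered S1, S2, S3
metadata <- data.frame(
  SampleID = c("S1", "S2", "S3"),
  Plant = c("A", "B", "C")
)
# Abundance columns are ordered S3, S1, S2
abund <- matrix(
  c(10, 5, 1, 1, 100, 50), nrow = 2,
  dimnames = list(c("OTU1", "OTU2"), c("S3", "S1", "S2"))
)

reads <- colSums(abund)

# Look up each sample by name instead of relying on position
results <- metadata
results$Reads <- unname(reads[match(results$SampleID, names(reads))])

# Equivalent join: merge() aligns on the shared SampleID key
reads_df <- data.frame(SampleID = names(reads), Reads = unname(reads))
merged <- merge(metadata, reads_df, by = "SampleID")
```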

amp_core, plotly only works with OTUs

time_variable in amp_timeseries

Hi!

I am trying to plot read abundance data using amp_timeseries. The data came out nicely but I am just wondering if there is a way to change the x-axis to Day 0, 1, 2, 3, etc. instead of the as_date format in time_variable.

Thanks for your time...
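
Computing a day number from a date column is straightforward in base R, as sketched below with made-up dates. Whether amp_timeseries accepts such a non-date column as its time variable is not confirmed here; the sketch only shows how to derive the values and labels.

```r
# Hypothetical sampling dates
dates <- as.Date(c("2018-01-01", "2018-01-02", "2018-01-04"))

# Whole days elapsed since the first sampling date
day_number <- as.integer(difftime(dates, min(dates), units = "days"))

# Labels in the "Day 0, Day 1, ..." style
day_label <- paste("Day", day_number)
```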

ampvis2 with phyloseq objects

I have been using ampvis2 for a while now and successfully loading data through amp_load.
Today, I have just read here that ampvis works on phyloseq objects.

However, if I try to run an ampvis2 function on a phyloseq object (GM), I get a data format error.

amp_ordinate(data = GM, type="RDA", constrain="Site")
Error: The provided data is not in ampvis2 format. Use amp_load() to load your data before using ampvis2 functions. (Or class(data) <- "ampvis2", if you know what you are doing.)

Is there really a way to use ampvis directly on phyloseq objects or data need to be transformed before using ampvis?

Thank you!

"Getting Started"

This is fantastic! I wish I had known about it sooner, but am glad it exists and am definitely using it now.

The README instructions for installing ampvis2 work on any R version 3.4.1 or later:
install.packages("remotes")
remotes::install_github("MadsAlbertsen/ampvis2")

This is not the case if one tries by
install.packages("ampvis2")

May be repetitive, but possibly worth adding a note on the Getting Started page.

Thanks for this!

Sabah
