dkahle / titan2

Threshold Indicator Taxa Analysis


titan2's Introduction

TITAN2

TITAN2 is the second R implementation of Threshold Indicator Taxa ANalysis. It is an R package source controlled with Git on GitHub and distributed on CRAN.

To learn more about TITAN2, check out the vignette here (you can click Download to view it in a separate window).

Note: a previous version of this readme stated that you could read the vignette; however, the vignette is not built when the package is downloaded from GitHub, so just access it as above.

Installation

You can install TITAN2 in either of two ways. At the present time, we recommend installing TITAN2 from GitHub, as it has several new features, e.g. plot_taxa_ridges().

From GitHub (dev version):

if (!requireNamespace("devtools")) install.packages("devtools")
devtools::install_github("dkahle/TITAN2")

From CRAN:

install.packages("TITAN2")

Acknowledgements

This work continues to be supported by the Department of Geography and Environmental Systems (UMBC), Department of Biology (Baylor), and Department of Statistical Science (Baylor).

titan2's Issues

Species matrix input type?

The documentation for the TITAN2 package suggests the input species matrix should be count data. Does it matter whether those data are counts or relative abundances?
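
For readers comparing the two input types, here is a minimal base R sketch of turning a site-by-taxon count matrix into relative abundances. The matrix `counts` is made-up illustration data, and the sketch does not settle whether titan() results differ between the two input types; that question is left open above.

```r
# Hypothetical site-by-taxon count matrix (rows = sites, columns = taxa).
counts <- matrix(
  c(10, 0, 5,
     2, 8, 0,
     0, 4, 4),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("site", 1:3), paste0("taxon", 1:3))
)

# Divide each row by its site total so every site sums to 1.
rel_abund <- prop.table(counts, margin = 1)

# Sanity check: each site's relative abundances should add up to 1.
rowSums(rel_abund)
```

A site with a total count of zero would produce NaN values here, so such rows would need to be dropped or handled before the conversion.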

Error when memory=TRUE; length of 'dimnames' [2] not equal to array extent

Hello, I have observed an error when running TITAN:

test <- titan(ENV20, TAXA20, memory=TRUE)

Error in matrix(unlist(value, recursive = FALSE, use.names = FALSE), nrow = nr,  : 
  length of 'dimnames' [2] not equal to array extent

This error seems to appear only when running with memory=TRUE on datasets that trigger the warning about a low number of pure and reliable taxa and have no z- or z+ taxa, like so:

test <- titan(ENV20, TAXA20, memory=FALSE)

Number of z- taxa = 0, Number of z+ taxa = 0
Warning message:
In titan(ENV20, TAXA20) :
  Low number of pure and reliable taxa, sum(z) output should be interpreted with caution

These datasets will run and return with the warning above if memory=FALSE. There is also no error when datasets do have sufficient data and produce z- and z+ taxa as in the vignette data.

Attached is a small set of sample data in which I have observed the error when memory=TRUE:
ENV20.txt
TAXA20.txt
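
Since the report says the same data run with memory=FALSE, one possible stopgap is to retry without the flag when the first attempt fails. The sketch below is an assumption-laden workaround, not a fix for the underlying error: `run_with_fallback` is a hypothetical helper, and it only assumes that the fitting function accepts a `memory` argument, as titan() does in the calls above.

```r
# Hypothetical helper: call a fitting function with memory = TRUE and,
# if that raises an error, retry the same call with memory = FALSE.
run_with_fallback <- function(fit_fun, ...) {
  tryCatch(
    fit_fun(..., memory = TRUE),
    error = function(e) {
      message("memory = TRUE failed: ", conditionMessage(e),
              "; retrying with memory = FALSE")
      fit_fun(..., memory = FALSE)
    }
  )
}

# Usage with the objects from the report (requires TITAN2 to be installed):
# test <- run_with_fallback(TITAN2::titan, ENV20, TAXA20)
```

The fallback runs the full analysis a second time, so it costs roughly one extra failed attempt in run time.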

ridge plot weird behavior when total number of pure and reliable taxa > 1000

First--the new plots are great!

I'm analyzing a relatively large 16S dataset and found some strange behavior in the ridge plots when a gradient has more than 1000 combined z-/z+ taxa and you try to limit the number of taxa plotted for readability.

Specifically, only the z- taxa will show, and at the bottom of the graph will be a mess of overwritten text that is the z+ taxa.

When you split the taxa plot (e.g. FALSE for plot z1) there is further weird behavior. Setting z2 to FALSE is fine; everything works as expected. Setting z1 to FALSE runs, but only 2-3 taxa are plotted, and they look very strange. I had to set the number of taxa plotted to ~300 to get roughly 60 to plot (this changes based on the titan output you're plotting).

Thanks!

titan boot problems

glades_titan_test <- titan(glades.env, glades.taxa, nBoot=1)
# Screening taxa...
#   100% occurrence detected 1 times (0.6% of taxa),
#   use of TITAN less than ideal for this data type.
#   taxa frequency screen complete.
# Partitioning along gradient...
# Calculating observed IndVal maxima and class values...
# Calculating IndVals using mean relative abundance...
# Permuting IndVal scores...
# Summarizing observed results...
# Estimating taxa change points using z-score maxima...
# Bootstrap resampling in sequence...
# 1.. Error in rowMeans(metricArray[, 1, ] == 1, na.rm = T) : 
#   'x' must be an array of at least two dimensions

glades_titan_test <- titan(glades.env, glades.taxa, ncpus = 8, nBoot = 2)
# Screening taxa...
#   100% occurrence detected 1 times (0.6% of taxa),
#   use of TITAN less than ideal for this data type.
#   taxa frequency screen complete.
# Partitioning along gradient...
# Calculating observed IndVal maxima and class values...
# Calculating IndVals using mean relative abundance...
# Permuting IndVal scores...
# Summarizing observed results...
# Estimating taxa change points using z-score maxima...
# Bootstrap resampling in parallel using 8 CPUs... no index will be printed to screen
# Error in ivz.bt.list[[s]][[l]] : 
#   attempt to select less than one element in integerOneIndex
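
The first error (nBoot=1) has the shape of a common base R pitfall: subsetting an array whose last dimension has length one drops it to a plain vector, which rowMeans() then rejects. The sketch below reproduces that message with a stand-in array. The name `metricArray` is taken from the error text, but its dimensions here are an assumption, and this has not been checked against the TITAN2 source. It does not address the second, parallel error.

```r
# Stand-in for a bootstrap result array: 5 taxa x 3 metrics x 1 replicate.
# The shape is assumed for illustration only.
metricArray <- array(1, dim = c(5, 3, 1))

# With a single replicate, `[` drops the result to a dimensionless vector ...
dropped <- metricArray[, 1, ]
is.null(dim(dropped))

# ... so rowMeans() fails; capture the message instead of stopping.
err <- tryCatch(rowMeans(dropped == 1, na.rm = TRUE), error = conditionMessage)

# Keeping the dimensions with drop = FALSE lets rowMeans() work again.
kept <- metricArray[, 1, , drop = FALSE]
row_means <- rowMeans(kept == 1, na.rm = TRUE)
```

If this is indeed the cause, running with more than one bootstrap replicate in sequence may sidestep the first error, although the report does not test that case.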

Can we use TITAN2 without having an environmental gradient?

Hello,

I'm currently running TITAN2 without any technical issues, but I would like to know whether TITAN2 can be used and interpreted without problems when we don't have an environmental gradient.

My data consist of multiple soil samples taken from two types of waste. For these samples we ran 16S rRNA sequencing and measured the concentrations of some environmental variables. These concentrations sometimes vary from 0.75 to 4.75 for the same environmental parameter across samples. Can we consider this variation a "gradient" and use TITAN2 on our data to show which taxa vary with the concentrations of physicochemical parameters?

Any confirmation or suggestion would be much appreciated,

Najoua

Update parallelization

CRAN check issues the following note:

Uses the superseded package: ‘snow’
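
As a rough illustration of what a migration could look like, the base `parallel` package (shipped with R) provides socket clusters with an interface close to that of snow. The sketch below is not the maintainers' plan: `one_replicate` is a hypothetical stand-in for a single bootstrap replicate, and the worker count of 2 is arbitrary.

```r
library(parallel)  # part of the standard R distribution

# Hypothetical stand-in for the work done in one bootstrap replicate.
one_replicate <- function(i) i^2

# Start a socket (PSOCK) cluster with two workers.
cl <- makeCluster(2)

# Run the replicates on the workers; always shut the cluster down afterwards.
res <- tryCatch(
  parLapply(cl, 1:4, one_replicate),
  finally = stopCluster(cl)
)

unlist(res)
```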

Migrate to using patchwork

The patchwork package should make the internals of some functions, e.g. plot_sumz_ridges(), much easier. It will probably be easiest if we use this instead of cowplot.

100% occurrence detected 1 times

Hello -
I am trying to use the TITAN function and am following the vignette. I keep getting the following error. I am using 16S rRNA data and have ASVs as my taxa matrix. I have transformed to relative abundance and tried to remove taxa that are not present in 5 or more samples. I have tried filtering a few ways and still keep getting this same error.

100% occurrence detected 1 times (100.0% of taxa),
use of TITAN less than ideal for this data type.
taxa frequency screen complete.
Partitioning along gradient...
Error in env.part(env, taxa, minSplt = minSplt, messaging = messaging) :
Number of sites not equal between env vector and taxa matrix

I am not sure why I am getting that my number of sites is not equal between the env vector and taxa matrix. Both of my data frames have 69 rows made up of the "sources". The taxa matrix has bacterial ASVs as columns.

Can you give more details on both parts of this error (100% occurrence and number of sites)?

Thank you,
Mia
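
Both messages refer to the shapes of the two inputs, so a quick check before calling titan() may help narrow things down. The helper below is hypothetical (`check_inputs` is not part of TITAN2) and only reports row counts, column counts, and how many taxa occur at every site; exactly what titan() compares internally is not shown in the report. A count of 1 that is also "100.0% of taxa" would be consistent with the taxa object being seen as a single column, but that is a guess.

```r
# Hypothetical helper: summarise the input shapes that the messages refer to.
check_inputs <- function(env, taxa) {
  taxa <- as.matrix(taxa)
  list(
    n_env_sites  = NROW(env),                 # sites in the gradient
    n_taxa_sites = nrow(taxa),                # sites (rows) in the taxa matrix
    n_taxa       = ncol(taxa),                # taxa (columns)
    sites_match  = NROW(env) == nrow(taxa),
    # taxa with a non-zero value at every site ("100% occurrence")
    n_everywhere = sum(colSums(taxa > 0) == nrow(taxa))
  )
}

# Toy data: three sites, two taxa, taxon "b" present at every site.
env  <- c(0.5, 1.2, 3.4)
taxa <- data.frame(a = c(1, 0, 2), b = c(3, 1, 4))
report <- check_inputs(env, taxa)
str(report)
```

If `sites_match` is FALSE or `n_taxa` is unexpectedly 1, the way the data frames were read in or subset is worth checking before looking at the filtering steps.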
