vmikk / metagMisc
Miscellaneous functions for metagenomic analysis.
License: MIT License
Hello,
I am trying to run a cumulative sum scaling normalization on a phyloseq object and got an error:
sr4css <- phyloseq_transform_css(sr4, norm = TRUE, log = TRUE)
# Error in dimnames(x) <- dn : length of 'dimnames' [1] not equal to array extent
I am not quite sure what this means, but was able to successfully run this transformation on the Global Patterns phyloseq object:
gp.css <- phyloseq_transform_css(GlobalPatterns, norm = TRUE, log = TRUE)
I couldn't find any obvious differences between the layout of these objects except that my object had taxa arranged as columns. I transposed this matrix and created a new object:
uto_table <- t(otu_table(sr4)) # transposed my otu table in case taxa needed to be rows
sr4tOTU <- phyloseq(tax_table(tax_table(sr4)), sample_data(sample_data(sr4)), otu_table(uto_table), taxa_are_rows=TRUE)
sr4css <- phyloseq_transform_css(sr4tOTU, norm = TRUE, log = TRUE)
This returned the same error. I get the same result when I omit norm = TRUE, log = TRUE. Any suggestions?
I am running R 3.6.2 with phyloseq version 1.30.0, metagMisc 0.0.4, and metagenomeSeq 1.28.0.
Thanks,
Scott
Hi vmikk,
loving the package!
I just wanted to make you aware that using devtools::install_github("vmikk/metagMisc") somehow does not install the prevalence() function. I am not sure if I am doing something wrong or if it is a problem.
Best
Lukas
Hello,
I'm using the function phyloseq_filter_prevalence to filter a phyloseq object. With R version 4.2.3 it works correctly, but after upgrading R to version 4.3.1 I get the error "OTU abundance data must have non-zero dimensions" with the same data.
I switched R back to version 4.2.3 to check again, and it worked.
The command:
physeq3 <- phyloseq_filter_prevalence(physeq.gen, prev.trh = 0.5, abund.trh = 10, threshold_condition = "OR")
Error in validObject(.Object) : invalid class “otu_table” object:
OTU abundance data must have non-zero dimensions.
Thanks in advance for your help.
Hi all
I have an issue with the phyloseq_mult_raref_avg function. It works on this phyloseq object:
phyloseq_summary(ps, cols = NULL, more_stats = FALSE, long = FALSE)
works <- phyloseq_mult_raref_avg(ps, replace = T, SampSize = 10000, iter = 3)
..Multiple rarefaction
|=====================================================================================| 100%
..Sample renaming
..Rarefied data merging
..Splitting by sample
..OTU abundance averaging within rarefaction iterations
|=====================================================================================| 100%
..Re-create phyloseq object
But not on this phyloseq object:
phyloseq_summary(p.b.a.m.s.lab, cols = NULL, more_stats = FALSE, long = FALSE)
fails <- phyloseq_mult_raref_avg(p.b.a.m.s.lab, replace = T, SampSize = 10000, iter = 3)
..Multiple rarefaction
|=====================================================================================| 100%
..Sample renaming
..Rarefied data merging
..Splitting by sample
Error in validObject(.Object) : invalid class “otu_table” object:
OTU abundance data must have non-zero dimensions.
validotu_table(otu_table(p.b.a.m.s.lab))
[1] TRUE
sum(is.na(otu_table(p.b.a.m.s.lab)))
[1] 0
I have also run psmelt on it, and everything looks fine, with no irregularities. This makes no sense to me, since phyloseq_mult_raref works on both objects.
Regards Cameron
Dear All,
I am trying to split a phyloseq object after merging taxa at the genus level.
I executed the following steps:
amgut_genus <- phyloseq::tax_glom(physeq, taxrank = "Genus")
taxtab <- amgut_genus@tax_table@.Data
# Find undefined taxa (in this data set, unknowns occur only up to Family)
miss_f <- which(taxtab[, "Family"] == "f__")
miss_g <- which(taxtab[, "Genus"] == "g__")
# Number unspecified genera
taxtab[miss_f, "Family"] <- paste0("f__", 1:length(miss_f))
taxtab[miss_g, "Genus"] <- paste0("g__", 1:length(miss_g))
# Find duplicate genera
dupl_g <- which(duplicated(taxtab[, "Genus"]) |
duplicated(taxtab[, "Genus"], fromLast = TRUE))
for(i in seq_along(taxtab)){
# The next higher non-missing rank is assigned to unspecified genera
if(i %in% miss_f && i %in% miss_g){
taxtab[i, "Genus"] <- paste0(taxtab[i, "Genus"], "(", taxtab[i, "Order"], ")")
} else if(i %in% miss_g){
taxtab[i, "Genus"] <- paste0(taxtab[i, "Genus"], "(", taxtab[i, "Family"], ")")
}
# Family names are added to duplicate genera
if(i %in% dupl_g){
taxtab[i, "Genus"] <- paste0(taxtab[i, "Genus"], "(", taxtab[i, "Family"], ")")
}
}
amgut_genus@tax_table@.Data <- taxtab
rownames(amgut_genus@otu_table@.Data) <- taxtab[, "Genus"]
Then I tried to split the phyloseq data as follows:
amgut_split_genus <- metagMisc::phyloseq_sep_variable(amgut_genus,
"Intervention")
But I got the following error:
Error in validObject(.Object) : invalid class “otu_table” object:
OTU abundance data must have non-zero dimensions.
Kindly help me
The current implementation of the prevalence function utilizes base R functionalities for data manipulation and processing.
While this works fine for small datasets, it becomes significantly slower when handling larger, more complex datasets.
To improve the function's performance, it should be rewritten using the magnificent data.table package, which offers fast and memory-efficient methods for data manipulation.
Related to #26
My Code:
# Merge seq and tree
OTU <- otu_table(species_otu_table, taxa_are_rows = TRUE)
physeq <- phyloseq(OTU, otu_tree)
# Count
phy_SES <- phyloseq_phylo_ses(
physeq,
measures = c("PD", "MPD", "VPD"),
null_model = "richness",
package = "picante",
abundance_weighted = FALSE,
nsim = 1000,
swapiter = 1000,
)
Error information:
SES analysis started at 10:06:44
Error in if (class(dis) %in% "phylo") { : the condition has length > 1
I can use this function to calculate "PD" and "MPD", but not the "VPD" measure.
> phy_SES <- phyloseq_phylo_ses(
+ physeq,
+ measures = c("PD", "MPD"),
+ null_model = "richness",
+ package = "picante",
+ abundance_weighted = FALSE,
+ nsim = 1000,
+ swapiter = 1000,
+ )
SES analysis started at 10:12:25
..Randomizing data with 'richness' algorithm
|===================================================================================================| 100%
..Estimating phylogenetic diversity for the randomized data
|===================================================================================================| 100%
..Estimating effect size
..Done
Analysis finished at 10:12:37
I guess something goes wrong in the "VPD" calculation.
Show "OTU" content:
> OTU
OTU Table: [17 taxa and 32 samples]
taxa are rows
LYG1 LYG2 LYG3 LYG4 LYG5 LYG6 LYG7 LYG8 LYG9 LYG10 LYG11 LYG12 LYG13 LYG14
Amblychaeturichthys_hexanema 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Chaeturichthys_stigmatias 7509 13980 5698 36438 2463 31835 2985 4479 8514 11374 13725 3886 43904 44243
Coilia_nasus 0 0 0 0 0 0 0 0 0 31 0 33 0 0
Collichthys_lucidus 1339 119 70 71 189 3421 2064 3032 1273 6879 2141 2081 0 0
Ctenotrypauchen_chinensis 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Cynoglossus_joyneri 10944 442 3136 5513 4444 4426 4067 1640 1909 2166 5682 1345 5810 4196
Engraulis_japonicus 2882 274 201 19840 530 921 5837 8303 18928 6489 28 502 21 402
Hexagrammos_otakii 69 12 0 12 37 718 7 453 1152 38 14 2264 1089 1808
Konosirus_punctatus 40 0 0 15 0 0 0 0 0 0 18 1280 3551 19
Larimichthys_polyactis 1002 0 0 0 704 53 12 15 13 14 27 1198 0 0
Liparis_tanakae 1026 105 0 0 52 577 0 0 1304 9149 1754 0 275 1161
Lophius_litulon 901 7363 1295 132 45 803 33 1578 222 62 8 1222 1402 823
Odontamblyopus_lacepedii 0 0 0 0 0 0 0 0 0 0 0 0 53 508
Pholis_fangi 971 35 1865 68 3233 235 6094 2609 12732 188 1050 2014 747 387
Scomber_japonicus 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Setipinna_taty 106 0 16 53 63 14 767 1903 1574 23 3796 25 8 0
Thryssa_kammalensis 603 11153 880 41 1014 2611 4213 24 2874 3058 3038 4066 22 2424
LYG15 LYG16 ZH1 ZH2 ZH3 ZH4 ZH5 ZH6 ZH7 ZH8 ZH9 ZH12 ZH13 ZH15 ZH16
Amblychaeturichthys_hexanema 47 28 0 0 0 0 0 0 0 0 0 0 0 0 0
Chaeturichthys_stigmatias 32321 27175 1160 9 8 0 0 0 0 2529 0 36 0 256 0
Coilia_nasus 0 0 0 0 0 0 0 0 0 0 1853 246 7 0 105
Collichthys_lucidus 58 40 0 0 0 0 0 0 9 0 0 0 540 587 9
Ctenotrypauchen_chinensis 0 0 0 0 0 0 1148 0 0 864 0 14 0 270 1479
Cynoglossus_joyneri 5682 5531 0 0 0 0 0 0 0 0 0 0 0 0 0
Engraulis_japonicus 2395 3425 6344 4096 949 2462 187 2594 2363 7442 4109 269 2084 448 128
Hexagrammos_otakii 7 70 0 0 0 0 0 0 0 0 0 0 0 0 0
Konosirus_punctatus 923 4595 0 0 0 0 0 0 0 0 0 0 0 0 0
Larimichthys_polyactis 0 19 0 0 0 0 0 0 0 0 0 0 0 0 0
Liparis_tanakae 44 0 662 67 10 24 13 0 6061 1400 1229 11768 20 465 0
Lophius_litulon 11 48 0 0 0 0 0 0 0 0 0 0 0 0 0
Odontamblyopus_lacepedii 568 19 0 0 0 0 0 0 0 0 0 0 0 0 0
Pholis_fangi 3256 101 0 0 0 0 0 0 351 0 0 0 0 8 0
Scomber_japonicus 0 0 0 0 0 3093 0 13 0 0 0 0 0 0 0
Setipinna_taty 828 1910 0 0 0 0 0 0 0 0 0 0 0 0 0
Thryssa_kammalensis 4896 152 0 0 0 0 0 0 0 0 0 0 0 0 0
ZH17 ZH18 ZH19
Amblychaeturichthys_hexanema 0 0 0
Chaeturichthys_stigmatias 6 0 21
Coilia_nasus 24 12 0
Collichthys_lucidus 0 0 7
Ctenotrypauchen_chinensis 14 12 0
Cynoglossus_joyneri 0 0 0
Engraulis_japonicus 16132 10753 7846
Hexagrammos_otakii 0 0 0
Konosirus_punctatus 0 0 0
Larimichthys_polyactis 0 0 0
Liparis_tanakae 26 1209 2092
Lophius_litulon 0 0 0
Odontamblyopus_lacepedii 0 0 0
Pholis_fangi 0 0 1956
Scomber_japonicus 0 0 0
Setipinna_taty 0 0 0
Thryssa_kammalensis 0 0 0
Show "otu_tree" content:
> otu_tree
Phylogenetic tree with 17 tips and 15 internal nodes.
Tip labels:
Amblychaeturichthys_hexanema, Chaeturichthys_stigmatias, Ctenotrypauchen_chinensis, Odontamblyopus_lacepedii, Hexagrammos_otakii, Pholis_fangi, ...
Unrooted; includes branch lengths.
How can I fix this error?
Many Thanks!
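A note on the message "the condition has length > 1": in R 4.2 and later, if() raises an error when its condition has more than one element, and class(x) %in% "phylo" returns one logical value per class of x. The base R sketch below only illustrates that mechanism. Which object reaches this check in the "VPD" code path is an assumption here (a matrix is used purely as a stand-in), and inherits() is shown as the usual single-value alternative, not as the package's actual fix.

```r
# Assumption: the object reaching the check has more than one class,
# e.g. a matrix, whose class is c("matrix", "array") in R >= 4.0.
dis <- matrix(0, nrow = 2, ncol = 2)

old_check <- class(dis) %in% "phylo"   # one logical value per class of `dis`
new_check <- inherits(dis, "phylo")    # always a single TRUE/FALSE

length(old_check)  # same as length(class(dis))
new_check          # FALSE, because a matrix is not a "phylo" tree

# if (old_check) {...} fails in R >= 4.2 whenever length(old_check) > 1;
# if (new_check) {...} is safe for any object.
```

If the check sits inside a dependency rather than in metagMisc itself, updating that dependency may be the relevant step; that too is a guess from the error text alone.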
Dear @vmikk,
I suggest to substitute vegan::adonis with vegan::adonis2 because:
I can prepare a PR if needed.
Thanks.
Hi @vmikk. I am finding this package very useful. I am currently trying to perform multiple rarefaction using the phyloseq_mult_raref function. What I need is a rarefied table averaged over all iterations. I have tried the other function, phyloseq_mult_raref_avg, but its output is a relative abundance matrix. I would really appreciate it if you could help me obtain an averaged rarefied table with absolute abundances instead of relative abundances.
Thanks!
I want to design a logo for the readme. What do you say?
It should be quite easy to parallelize plyr-dependent functions to speed up processing of data.
Functions that could be modified to accept parallel option:
phyloseq_mult_raref
phyloseq_sep_tax
parse_silva_tax_batch
parse_taxonomy_amptk_batch
parse_amplicon_table
- just add ... as argument to silva_tax_parse_batch
Hi,
I am trying to see the shared OTUs from my phyloseq object.
I am running the function:
shared <- phyloseq_extract_shared_otus(physeqr, samp_names = sample_names(physeqr))
And then I obtain the following error:
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘prune_samples’ for signature ‘"array", "phyloseq"’
Any idea what is wrong?
Cheers,
Pablo
It seems that the dissimilarity_to_distance function (which transforms a non-metric dissimilarity matrix to a weighted Euclidean distance) works only with data where the number of samples > the number of species.
Otherwise, smacof::smacofConstraint returns the error "For C diagonal the number of dimensions needs to match the number of covariates!".
This function is based on code from Greenacre 2017 (DOI:10.1002/ecy.1937)
Hello!
I had performed several analyses using phyloseq_to_df without problems, but now I am getting an error:
"Error: Sample names in 'physeq' could not be automatically converted to the syntactically valid column names in data.frame (see 'make.names'). Consider renaming with 'sample_names'."
The sample names appear to have no invalid characters (they are just numbers).
sample_names(ps)
[1] "1" "10" "11" "13" "14" "15" "2" "20" "27" "28" "3" "5" "7" "8" "9"
Any suggestions?
Thank you so much!
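A possible explanation, based only on the error text: names made of digits alone are not syntactically valid R names, so make.names() changes them by prepending "X". The base R sketch below shows that behaviour on a few of the sample names listed above. The "S" prefix is an arbitrary choice for illustration, and that the function compares sample names against make.names() output is an assumption drawn from the message.

```r
# A few of the purely numeric sample names from the report above
nms <- c("1", "10", "11", "13")

# make.names() prepends "X" to names that start with a digit
converted <- make.names(nms)
converted
identical(converted, nms)            # FALSE: the names would be altered

# Adding a letter prefix yields names that make.names() leaves untouched
fixed <- paste0("S", nms)
identical(make.names(fixed), fixed)  # TRUE

# With phyloseq loaded, the same renaming could be applied to the object:
# sample_names(ps) <- paste0("S", sample_names(ps))
```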
I believe dependencies in phyloseq_inext may be compromised. The function was working for me beautifully maybe a week ago, and now the example code associated with this function is no longer working.
R.Version() = 4.2.2 (2022-10-31)
packageVersion("phyloseq") = ‘1.42.0’
packageVersion("metagMisc") = ‘0.5.0’
See the reproducible example below:
library(phyloseq)
library(metagMisc)
data("esophagus")
phyloseq_inext(esophagus)
Error in rbind(deparse.level, ...): numbers of columns of arguments do not match
phyloseq_inext(esophagus, curve_type = "coverage")
Error in rbind(deparse.level, ...): numbers of columns of arguments do not match
phyloseq_inext(esophagus, justDF = T)
Error in rbind(deparse.level, ...): numbers of columns of arguments do not match
Hello, I can see documentation for the functions phyloseq_replace_zero and phyloseq_transform_aldex_clr, however these are not available in the R package. Is it that they don't work?
Thanks a lot!
https://rdrr.io/github/vmikk/metagMisc/src/R/phyloseq_transform.R#sym-phyloseq_transform_aldex_clr
I'm using the function phyloseq_filter_prevalence to filter a phyloseq object.
I got the error "OTU abundance data must have non-zero dimensions":
df<-phyloseq_filter_prevalence(physeq = myphylo, prev.trh = 0.3, abund.trh = NULL)
Error in validObject(.Object) : invalid class “otu_table” object:
OTU abundance data must have non-zero dimensions.
Could you please provide me with a solution to my problem? Thank you.
I get this error (the original message was in Italian; it is translated here):
Error in sort.list(bx[m$xi]) :
'x' must be atomic for 'sort.list', method "shell" and "quick"
Have you called 'sort' on a list?
In addition: Warning messages:
1: : ... may be used in an incorrect context: ‘.fun(piece, ...)’
2: : ... may be used in an incorrect context: ‘.fun(piece, ...)’
We tried running parts of your function, and the error comes from: res <- merge(res, mtd, by = "SampleID")
I tried reinstalling your package and R, downgrading the R version, changing the environment, and so on, but nothing changed.
I built the phyloseq object with the qiime2R package.
Need to implement an efficient rarefaction function based on RTK.
Hi Vladimir,
I am trying to use your two functions to extract shared or non-shared otus, but I am getting an error about OTU abundance having non-zero dimensions.
To make my phyloseq object, I did the following:
taxa_tab <- tax_table(tax_tab)
samp_dat <- sample_data(metadata)
otu_tab_norm <- otu_table(otu_tab_10knorm, taxa_are_rows = T)
ps <- phyloseq(otu_tab_norm, samp_dat, taxa_tab)
Then I ran your shared otu function and received the error:
shared_esvs<-phyloseq_extract_shared_otus(ps)
"Error in validObject(.Object) : invalid class “otu_table” object:
OTU abundance data must have non-zero dimensions."
I also double checked that the OTU table did not have NAs and double checked my OTU table has reads:
table(is.na(otu_table(ps)))
FALSE
43955446
sum(colSums(otu_table(ps)))
[1] 4916230
I have used this phyloseq object for my other analyses with no problem, so I don't think it has to do with object manipulation, as in issue #19. Any thoughts would be helpful, thank you!
Jordan
I have a problem using prevalence.
How can I install version 0.0.4?
I tried:
devtools::install_version("metagMisc", verstion = 0.0.4)
Error: unexpected numeric constant in "devtools::install_version("metagMisc", verstion = 0.0.4"
and
devtools::install_version("metagMisc", 0.0.4)
Error: unexpected numeric constant in "devtools::install_version("metagMisc", 0.0.4"
remotes::install_version("metagMisc", version = 0.0.4)
Error: unexpected numeric constant in "remotes::install_version("metagMisc", version = 0.0.4"
remotes::install_github("metagMisc", 0.0.4)
Error: unexpected numeric constant in "remotes::install_github("metagMisc", 0.0.4"
devtools::install_github("metagMisc",0.0.4)
Error: unexpected numeric constant in "devtools::install_github("metagMisc",0.0.4"
Any suggestions, please?
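All of the attempts above stop at the parser, before any installation code runs: 0.0.4 is not a valid numeric literal in R, so a version has to be passed as a quoted string. The sketch below checks this with base R only; the helper name parses is introduced here for illustration. Note also that install_version() looks in CRAN-like repositories, and that install_github() expects an "owner/repo" string such as "vmikk/metagMisc", as used elsewhere in this thread.

```r
# Returns TRUE if the code string is syntactically valid R (nothing is executed).
parses <- function(code) {
  !inherits(try(parse(text = code), silent = TRUE), "try-error")
}

parses('install_version("metagMisc", version = 0.0.4)')    # FALSE: unquoted version
parses('install_version("metagMisc", version = "0.0.4")')  # TRUE: quoted version

# Installing from GitHub, as shown elsewhere in this thread:
# remotes::install_github("vmikk/metagMisc")
# Pinning a release through `ref` is an assumption; "v0.0.4" is a hypothetical tag name:
# remotes::install_github("vmikk/metagMisc", ref = "v0.0.4")
```

Whether the repository provides a tag or commit matching version 0.0.4 is not stated in this thread, so the ref value would need to be checked against the repository first.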
Currently, no tests other than examples in the manual are part of the package.
See testthat package (+test workflow example).
I was planning to do multiple rarefactions, but I could not find an explanation of why rarefaction is performed at a depth equal to 0.9 * the minimal observed sample size.
What is the rationale behind this?
Should we switch from plyr to purrr?
Check also multidplyr (a backend for dplyr) to parallelize the code.
Hi there,
I am trying to install vmikk/metagMisc as follows:
devtools::install_github("vmikk/metagMisc")
However, I get the following error:
Downloading GitHub repo vmikk/metagMisc@master
Skipping 1 packages not available: phyloseq
✓ checking for file ‘/private/var/folders/st/y2p98lbd0y58731wvyg2rgpc0000gp/T/Rtmp2soaSi/remotese111f52b72d/vmikk-metagMisc-d25b92f/DESCRIPTION’ (338ms)
─ preparing ‘metagMisc’:
✓ checking DESCRIPTION meta-information ...
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ building ‘metagMisc_0.0.4.tar.gz’
Any help will be appreciated
Session information:
R version 3.6.2 (2019-12-12)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] remotes_2.1.1 phyloseq_1.30.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.4 ape_5.3 lattice_0.20-38
[4] prettyunits_1.1.1 ps_1.3.2 Biostrings_2.54.0
[7] assertthat_0.2.1 rprojroot_1.3-2 digest_0.6.25
[10] foreach_1.4.8 R6_2.4.1 plyr_1.8.6
[13] backports_1.1.5 stats4_3.6.2 ggplot2_3.3.0
[16] pillar_1.4.3 zlibbioc_1.32.0 rlang_0.4.5
[19] curl_4.3 rstudioapi_0.11 data.table_1.12.8
[22] vegan_2.5-6 callr_3.4.2 S4Vectors_0.24.3
[25] Matrix_1.2-18 desc_1.2.0 devtools_2.2.2
[28] splines_3.6.2 stringr_1.4.0 igraph_1.2.4.2
[31] munsell_0.5.0 compiler_3.6.2 pkgconfig_2.0.3
[34] BiocGenerics_0.32.0 multtest_2.42.0 pkgbuild_1.0.6
[37] mgcv_1.8-31 biomformat_1.14.0 tidyselect_1.0.0
[40] tibble_2.1.3 IRanges_2.20.2 codetools_0.2-16
[43] fansi_0.4.1 permute_0.9-5 crayon_1.3.4
[46] dplyr_0.8.5 withr_2.1.2 MASS_7.3-51.4
[49] grid_3.6.2 nlme_3.1-142 jsonlite_1.6.1
[52] gtable_0.3.0 lifecycle_0.2.0 magrittr_1.5
[55] scales_1.1.0 cli_2.0.2 stringi_1.4.6
[58] XVector_0.26.0 reshape2_1.4.3 fs_1.3.1
[61] testthat_2.3.2 ellipsis_0.3.0 Rhdf5lib_1.8.0
[64] iterators_1.0.12 tools_3.6.2 ade4_1.7-13
[67] Biobase_2.46.0 glue_1.3.2 purrr_0.3.3
[70] pkgload_1.0.2 processx_3.4.2 parallel_3.6.2
[73] survival_3.1-8 colorspace_1.4-1 rhdf5_2.30.1
[76] cluster_2.1.0 sessioninfo_1.1.1 memoise_1.1.0
[79] usethis_1.5.1
See:
Hi, could you share how you created the Prevalence plots (total OTU abundance vs OTU prevalence)?
Thanks!
Running the command test <- metagMisc::phyloseq_filter_prevalence(ASVnoSingletons, prev.trh = 0.10, abund.trh = NULL)
to filter the OTUs that appear in >10%, I receive this error:
Error in dimnames(x) <- dn : length of 'dimnames' [1] not equal to array extent
What could it be?