iferres / pagoo

A comprehensive and intuitive encapsulated OO class system for analyzing bacterial pangenomes in R.

Home Page: https://iferres.github.io/pagoo/

R 100.00%

pangenome r rstats r6 microbial-genomics bioinformatics

pagoo's Issues

Bug when `$add_metadata()` with missing key at the end

The function works fine, except when a key is missing at the end of the supplied data.frame, e.g.:

> p$organisms
DataFrame with 5 rows and 2 columns
        org        sero
   <factor> <character>
1 organismA           a
2 organismB           b
3 organismC           c
4 organismD           a
5 organismE           b

p$add_metadata("org", data.frame(org=p$organisms$org[-5], Country=c(letters[1:4])))
 Error in `[[<-`(`*tmp*`, nwcls[i], value = c("a", "b", "c", "d")) : 
  4 elements in value to replace 5 elements

If any organism other than the last one is removed from the metadata data.frame instead, it works fine.
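
One way to make the result independent of which key is missing is to align the supplied rows to the full key vector with `match()`, so absent keys become `NA` wherever they fall. The sketch below uses a plain base R data.frame rather than pagoo's internals; `align_metadata` is a hypothetical helper, not a pagoo function.

```r
# Hypothetical helper (not part of pagoo): align metadata rows to a key vector.
# Keys absent from `meta` yield NA rows, whatever their position.
align_metadata <- function(keys, meta, key_col) {
  idx <- match(as.character(keys), as.character(meta[[key_col]]))
  aligned <- meta[idx, setdiff(names(meta), key_col), drop = FALSE]
  rownames(aligned) <- NULL
  aligned
}

orgs <- factor(paste0("organism", LETTERS[1:5]))
meta <- data.frame(org = orgs[-5], Country = letters[1:4],
                   stringsAsFactors = FALSE)

# The last organism has no metadata, so its Country is NA.
aligned <- align_metadata(orgs, meta, "org")
print(aligned)
```

The aligned frame always has one row per organism, so it can be column-bound to the existing table without a length mismatch.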

$add_metadata() should override columns with same name

If I pass a data.frame with a column name that already exists, the current behaviour is to duplicate the column. The desired behaviour is to override existing columns that share a name with a column in the provided data.frame. A method to remove columns should also be provided.
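
The desired semantics can be illustrated on a plain base R data.frame (pagoo stores this table as an S4Vectors DataFrame, so this is only a sketch of the behaviour, not the package's code). `set_columns` and `drop_columns` are hypothetical names.

```r
# Hypothetical helpers (not part of pagoo) showing override-by-name semantics.
set_columns <- function(current, new) {
  stopifnot(nrow(current) == nrow(new))
  # Assigning by name replaces existing columns and appends new ones.
  current[names(new)] <- new
  current
}

drop_columns <- function(current, cols) {
  current[setdiff(names(current), cols)]
}

orgs <- data.frame(org = c("A", "B", "C"), sero = c("a", "b", "c"),
                   stringsAsFactors = FALSE)
new_meta <- data.frame(sero = c("x", "y", "z"), Country = c("UY", "AR", "BR"),
                       stringsAsFactors = FALSE)

updated <- set_columns(orgs, new_meta)      # sero is overridden, Country added
trimmed <- drop_columns(updated, "Country") # Country removed again
```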

Re-assign genes to other clusters

Pangenome manipulation implies that a refinement method (or even manual curation) could identify misassignments. It should be possible to reassign a gene to another (potentially new) cluster.

Code coverage - shiny on tests

Code coverage is currently about 40%. I think all the code is very well tested, but I couldn't figure out how to test the shiny app, which accounts for a large share of the lines of code. That is what drops the coverage to 40%.

Failing to process gene_presence_absence.csv file

Hello,

Has anybody encountered the same issue? When I load the csv file, I get this error:
gffs <- list.files(path = "../gffs/", pattern = "[.]gff$", full.names = TRUE)
gpa_csv <- "gene_presence_absence.Rtab"

library(pagoo)

p <- roary_2_pagoo(gene_presence_absence_csv = gpa_csv, gffs = gffs)

Reading csv file (roary).
Processing csv file.
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 180601, 0

gene IDs from Panaroo (1.3.2) presence/absence matrix file do not match IDs in GFF files

I am running panaroo 1.3.2 and pagoo 0.3.17, both the latest versions.

I noticed that in the matrix files produced by Panaroo (gene_presence_absence_roary.csv and gene_presence_absence.csv), there are two types of gene IDs that are not present in the GFF files, for example: "607_refound_10510" (I do not know where they come from), and "GCF_004322615.1_00534;GCF_004322615.1_00535" (two genes concatenated and separated by ";").

In my run, fewer than 2% of genes are like this. I can run a script to replace these items with "" so that panaroo_2_pagoo runs. I am wondering whether panaroo_2_pagoo could have an extra step that filters out genes not in the GFF files and outputs a warning message and an error log listing the removed genes, so that users can decide whether the removal is acceptable, or at least use it to troubleshoot.
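
A filtering step of this kind could look roughly like the sketch below. It is base R only and not pagoo's implementation: `filter_unknown_genes` is a hypothetical name, and it assumes that ";"-joined cells can be split into individual IDs that are each looked up in the GFF IDs, which the report above does not confirm.

```r
# Hypothetical helper (not part of pagoo): drop gene IDs that are absent from
# the GFF files, and report what was removed.
filter_unknown_genes <- function(cells, gff_ids, sep = ";") {
  removed <- character(0)
  kept <- vapply(cells, function(cell) {
    if (is.na(cell) || !nzchar(cell)) return("")
    ids <- strsplit(cell, sep, fixed = TRUE)[[1]]
    known <- ids %in% gff_ids
    removed <<- c(removed, ids[!known])  # collect IDs for the log
    paste(ids[known], collapse = sep)
  }, character(1), USE.NAMES = FALSE)
  if (length(removed) > 0) {
    warning(length(removed), " gene ID(s) not found in the GFF files were removed: ",
            paste(removed, collapse = ", "), call. = FALSE)
  }
  list(cells = kept, removed = removed)
}

# Illustrative data; the third GFF ID is made up for the example.
gff_ids <- c("GCF_004322615.1_00534", "GCF_004322615.1_00535",
             "GCF_004322615.1_00600")
cells <- c("GCF_004322615.1_00600", "607_refound_10510",
           "GCF_004322615.1_00534;GCF_004322615.1_00535", "")

result <- filter_unknown_genes(cells, gff_ids)
```

Returning the removed IDs alongside the cleaned cells lets the caller write them to a log file or inspect them before continuing.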

panaroo_2_pagoo error

Hi, I am following the panaroo_2_pagoo instructions exactly and using the gene_presence_absence.csv file (unchanged from its output from panaroo) and I am getting the following error:
gffs <- list.files(path = "full_path/new_spades_annotations/",
pattern = "[.]gff$",
full.names = TRUE)
gpa_csv <- "full_path/panaroo_spades/gene_presence_absence.csv"

library(pagoo)
pg <- panaroo_2_pagoo(gene_presence_absence_csv = gpa_csv,
gffs = gffs)

Reading csv file (panaroo).
Processing csv file.
Warning in panaroo_2_pagoo(gene_presence_absence_csv = gpa_csv, gffs = gffs) :
Removing refound genes with stop codon (tagged with '_stop')
Error in df[[COL]][[ROW]] <- df[[COL]][[ROW]][-INDEX] :
replacement has length zero

I have tried a number of things and I do not seem to be getting anywhere! Can anyone help?

Improve internal ggplot2 functions handling

See https://ggplot2.tidyverse.org/articles/ggplot2-in-packages.html

Reading gffs: Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : row names contain missing values

Thank you! I re-pulled and reinstalled pagoo via devtools. There is progress, but now I see this error:

p <- panaroo_2_pagoo(gene_presence_absence_csv = gpa_csv, gffs = gffs)
Reading csv file (panaroo).
Processing csv file.
Removing 314 genes tagged as 'refound', 'stop', and/or 'length' by panaroo.
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/01_A1.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/02_B4.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/02_F4.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/03_C2.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/05_A8.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/33_A7.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/44MNt_B4_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/63VAs-B3_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/63VAs-Sm1_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/68VAs-B3_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/68VPs-B6_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/81UNt-Sm4_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/83VAs-Sm8_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/83VPs-KB5_GCF_007197715.1_ASM719771v1_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/87UNt-Sm4_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/88MNs-Sm2_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/88VPs-Sm9_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/90VAs-B6_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/90VAs-Sm9_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/9VPs-B5_contigs.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/ATCC-51524_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/Dp_81Mnt_Sm4.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1914_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1922_CDC39-95_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1931_CDC4294-98_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1933_CDC4545-98_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1934_CDC4709-98_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1937_CDC4199-99_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL1939_CDC4792-99_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3033_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3043_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3050_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3052_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3065_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3069_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3070_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3077_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3084_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3086_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3090_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3246_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3250_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3256_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3264_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3274_genomic.gff3
Reading gff file /lustre/groups/liu_price_lab/mlaziz/Dp/panaroo-020923-bakta/gff/KPL3911_genomic.gff3
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, :
row names contain missing values

Originally posted by @malihaaziz in #59 (comment)

Change Shell and Cloud levels

Hi there,
I'm really enjoying Pagoo, but was wondering if there is a way to adjust the percentage cutoffs for Shell and Cloud genes, much like there is for Core genes?
From what I can tell, Pagoo treats only genes that occur in 1% of genomes (or perhaps in a single genome?) as cloud genes, whilst tools such as Panaroo treat genes present in fewer than 15% of genomes as Cloud genes.
I'd like to get my data to marry up across tools, and therefore set the core, shell, and cloud values in Pagoo to mirror those in Panaroo. Obviously I can change the core, but it would be great to be able to change the rest :)
Thanks for any help you can give, even if there is a hacky way I can do it
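
As a workaround outside the package, the categories can be recomputed from a binary presence/absence matrix with whatever cutoffs are wanted. The sketch below is base R and assumes a matrix with organisms in rows and clusters in columns; `classify_clusters`, `core_level` and `cloud_level` are hypothetical names and defaults, not pagoo's API.

```r
# Hypothetical helper (not part of pagoo): classify clusters by the fraction
# of genomes they occur in, with adjustable cutoffs.
classify_clusters <- function(pa, core_level = 0.95, cloud_level = 0.15) {
  freq <- unname(colMeans(pa > 0))  # fraction of organisms carrying each cluster
  category <- ifelse(freq >= core_level, "core",
                     ifelse(freq < cloud_level, "cloud", "shell"))
  data.frame(cluster = colnames(pa), frequency = freq, category = category,
             stringsAsFactors = FALSE)
}

# Toy matrix: 10 organisms (rows) by 3 clusters (columns).
pa <- cbind(geneA = rep(1, 10),
            geneB = c(rep(1, 5), rep(0, 5)),
            geneC = c(1, rep(0, 9)))

res <- classify_clusters(pa)
print(res)
```

With these cutoffs, geneA (present in all genomes) is core, geneB (50%) is shell, and geneC (10%) is cloud.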

Failed with error: ‘package ‘S4Vectors’ required by ‘pagoo’ could not be found’

I'm trying to install the package "pagoo", so I use the following code:

# Packages
packages <- c("ape","ggplot2","vegan","philentropy", "pagoo") # Specify your packages
# Install and libraries
package.check <- lapply(
  packages,
  FUN = function(x) {
    if (!require(x, character.only = TRUE)) {
      install.packages(x, dependencies = TRUE)
      library(x, character.only = TRUE)
    }
  }
)

The last package is the problem: pagoo. When I try to install it, I get an error:

Failed with error:  ‘package ‘S4Vectors’ required by ‘pagoo’ could not be found’
Warning in install.packages :
  dependencies ‘S4Vectors’, ‘Biostrings’, ‘GenomicRanges’, ‘BiocGenerics’, ‘DECIPHER’, ‘IRanges’ are not available

Show Traceback:
Error: package ‘S4Vectors’ required by ‘pagoo’ could not be found
5. stop(gettextf("package %s required by %s could not be found", sQuote(pkg), sQuote(pkgname)), call. = FALSE, domain = NA)
4. .getRequiredPackages2(pkgInfo, quietly = quietly)
3. library(x, character.only = TRUE)
2. FUN(X[[i]], ...)
1. lapply(packages, FUN = function(x) {
       if (!require(x, character.only = TRUE)) {
           install.packages(x, dependencies = TRUE)
           library(x, character.only = TRUE) ...

So, first I tried to install pagoo from source (which doesn't work), then to install S4Vectors from Bioconductor (which doesn't work either). The following warning appears:

Warning messages:
1: In install.packages(...) :
  installation of package ‘S4Vectors’ had non-zero exit status
2: In install.packages(update[instlib == l, "Package"], l, repos = repos,  :
  installation of package ‘igraph’ had non-zero exit status

From github (devtools::install_github('iferres/pagoo')):

ERROR: dependencies ‘S4Vectors’, ‘Biostrings’, ‘GenomicRanges’ are not available for package ‘pagoo’

In case you need to know, the Bioconductor version is 3.16. I don't know how else to install this package. Any ideas?

sessionInfo( )

R version 4.2.0 (2022-04-22)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Big Sur 11.6

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] BiocGenerics_0.43.0 BiocManager_1.30.18

loaded via a namespace (and not attached):
 [1] magrittr_2.0.3   splines_4.2.0    tidyselect_1.1.2 munsell_0.5.0    colorspace_2.0-3
 [6] lattice_0.20-45  R6_2.5.1         rlang_1.0.2      fansi_1.0.3      dplyr_1.0.9     
[11] tools_4.2.0      grid_4.2.0       gtable_0.3.0     nlme_3.1-157     mgcv_1.8-40     
[16] utf8_1.2.2       cli_3.3.0        ellipsis_0.3.2   tibble_3.1.7     lifecycle_1.0.1 
[21] crayon_1.5.1     Matrix_1.4-1     purrr_0.3.4      ggplot2_3.3.6    vctrs_0.4.1     
[26] glue_1.6.2       compiler_4.2.0   pillar_1.7.0     generics_0.1.2   scales_1.2.0    
[31] pkgconfig_2.0.3
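
The packages listed as unavailable are Bioconductor packages, which `install.packages()` does not find in the default CRAN repository. A usual route is to install them with `BiocManager::install()` before installing pagoo. The sketch below only lists what is missing and runs the installation in an interactive session; `missing_pkgs` is a hypothetical helper. It may not resolve the "non-zero exit status" compile failure reported above, which could have a different cause (for example the local build toolchain).

```r
# Bioconductor dependencies named in the error messages above.
bioc_deps <- c("S4Vectors", "Biostrings", "GenomicRanges",
               "BiocGenerics", "DECIPHER", "IRanges")

# Hypothetical helper: return the packages that cannot be loaded.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_pkgs(bioc_deps)

# Only touch the network when run interactively.
if (interactive() && length(to_install) > 0) {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install(to_install)
  install.packages("pagoo")
}
```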

Cluster annotations do not match cluster names

I just discovered this tool and am really enjoying it for processing my output data from Roary. Thank you for creating it!

I'm using roary_2_pagoo() to read my Roary output file. However, when I then run p$clusters, the annotations do not match the cluster names. It seems that the "Annotation" column was somehow scrambled during the conversion to the R6 class object. I looked for a pattern to the shifts (e.g. all Annotation rows shifted down by one), but I haven't found one.

I'm using Pagoo version 0.3.9 with R version 4.1.2.

test <- roary_2_pagoo("test.csv")

Reading csv file (roary).
Processing csv file.
Loading PgR6M class object.
Checking class.
Checking dimnames.
Creating gid (gene ids).
Checking provided cluster metadata.
Creating panmatrix.
Populating class.
Done.

test$clusters

DataFrame with 30 rows and 2 columns
      cluster    Annotation
1     ccdA       IS200/IS605 family t..
2     ccdB       hypothetical protein
3     dbpA       hypothetical protein
4     faeE       putative protein YjiK
5     group_103  hypothetical protein
...   ...        ...
26    lpfC'      Antitoxin CcdA
27    pemI       hypothetical protein
28    pemK       hypothetical protein
29    xerD       Antitoxin PemI
30    yjiK       hypothetical protein

Is it possible to run panaroo_2_pagoo without removing "refound_"?

I am working with a divergent group of organisms, so I want to keep the genes refound by panaroo because I believe the core genome is being underestimated without them. My understanding is that the genes labeled "refound_" are not inherently a problem. Instead, it's the genes with the "stop" marker that are likely pseudogenes. Is there a way to run panaroo_2_pagoo and not remove all "refound" genes but to still remove "_stop" genes?

Alternatively, is it possible to use panaroo_2_pagoo and remove no genes or to make a pagoo matrix directly from a csv matrix instead of the long format data? If this isn't possible, I'll just need to convert the panaroo matrix to long format for pagoo to reformat it back into a matrix again.
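
The conversion from the wide matrix to long format can be done in a few lines of base R, as in the sketch below. `wide_to_long` is a hypothetical helper, and the output column names `cluster`, `org` and `gene` are an assumption about the long format expected downstream; the organism columns must be passed explicitly because the csv also contains non-organism columns.

```r
# Hypothetical helper (not part of pagoo): reshape a wide presence/absence
# table into one row per gene.
wide_to_long <- function(wide, org_cols, cluster_col = "Gene", sep = ";") {
  pieces <- lapply(org_cols, function(org) {
    cells <- as.character(wide[[org]])
    keep <- !is.na(cells) & nzchar(cells)           # skip absent genes
    genes <- strsplit(cells[keep], sep, fixed = TRUE)
    data.frame(cluster = rep(wide[[cluster_col]][keep], lengths(genes)),
               org = rep(org, sum(lengths(genes))),
               gene = as.character(unlist(genes)),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, pieces)
}

# Toy input: two clusters, two organisms, one cell holding two genes.
wide <- data.frame(Gene = c("ccdA", "ccdB"),
                   orgA = c("a_1;a_2", ""),
                   orgB = c("b_1", "b_2"),
                   stringsAsFactors = FALSE)

long <- wide_to_long(wide, org_cols = c("orgA", "orgB"))
print(long)
```

Because no IDs are filtered here, refound genes are kept; any "_stop" entries could be removed from `long$gene` with `grepl()` before building the pangenome object.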

Thanks for your work on this package. Pagoo is great and so easy to use!

Is there a function pagoo_2_roary ?

Hello to the pagoo team,

First, my congratulations on the great work done in developing the pagoo R library.

My main question is the following:

Is there an easy and quick way of converting the pagoo pangenome object into the gene_presence_absence_csv output file and the other important output files generated by roary?
If not, are there plans to make one available?
If there are no such plans, could the pagoo team leave some quick indications on how to construct the gene_presence_absence_csv and other important roary output files from the pagoo pangenome R6 class object, PgR6MS?

I don't know much about how roary generates these output files internally, or which ones matter as input for other downstream software.

Generating this gene_presence_absence_csv file would be useful when pagoo lacks a function needed to analyze the pangenome of a given bacterial species but other post-processing software provides it: pagoo users would then have easy access to those functions.
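
As a rough starting point, the gene columns of such a table can be rebuilt from a long-format data.frame (one row per gene). The sketch below is base R only; `long_to_wide` is a hypothetical helper, the `cluster`, `org` and `gene` column names are assumed, and the tab used to join paralogs is an assumption that may need adjusting. It does not reproduce roary's other columns (annotation, counts, and so on).

```r
# Hypothetical helper (not part of pagoo): build a cluster-by-organism table
# of gene IDs from long-format data.
long_to_wide <- function(long, sep = "\t") {
  cells <- tapply(long$gene, list(long$cluster, long$org),
                  paste, collapse = sep)
  cells[is.na(cells)] <- ""  # absent genes become empty cells
  data.frame(Gene = rownames(cells), cells, check.names = FALSE,
             stringsAsFactors = FALSE, row.names = NULL)
}

long <- data.frame(cluster = c("ccdA", "ccdA", "ccdA", "ccdB"),
                   org = c("orgA", "orgA", "orgB", "orgB"),
                   gene = c("a_1", "a_2", "b_1", "b_2"),
                   stringsAsFactors = FALSE)

wide <- long_to_wide(long)
print(wide)
```

The result can then be written out with `write.csv(wide, "gene_presence_absence.csv", row.names = FALSE)` if a file is needed.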

Implement $drop(hard = TRUE)

Implement an option to completely remove an organism from the dataset. This would make it possible to add metadata for the available organisms without supplying NAs for the dropped ones. In the current implementation the user must add metadata for all organisms, including the hidden ones. This is unnecessarily complicated if those organisms are garbage.

Suggested implementation:

p$drop( x = 1:4, hard = TRUE)

I now think default should be hard = FALSE (current behavior), but not sure.

Add `phylo` object as metadata

A tree can be considered pangenome metadata. Provide a method to add a phylo object to the pangenome (with $add_metadata()), and make other methods aware of it so they can use it to improve, e.g., visualizations.

panaroo to pagoo, support for bakta annos

Hi pagoo,

Love the program. It was working great with prokka annotations; however, I just tried to redo an analysis with bakta annotations, and running panaroo_2_pagoo and roary_2_pagoo gave me errors.

I ran into two issues. The first:

Reading gff file /Users/XXXX/Desktop/XXXX/XXXXXX/bakta_pano/bakta_gff//XXXXX.gff3
Error in .Call2("C_solve_user_SEW", refwidths, start, end, width, translate.negative.coord, :
solving row 61: 'allow.nonnarrowing' is FALSE and the supplied end (47314) is > refwidth

It seems to be an issue with reading in the coordinates of an annotation in the gff. I am unsure whether this is due to BAKTA calling genes whose coordinates span the start and end of the plasmid. I took that line out of the gff and it worked fine.

Then I ran into this error:
Reading csv file (roary).
Processing csv file.
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/R1_polished.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/T1_polished.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/X_albilineans_FJ.1.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/X_arboricola_juglandis_CPBF1494.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/X_axonopodis_vasculorum_NCPPB796.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/X_campestris_musacearum_NCPPB4379.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/X_campestris_raphani_MAF106181.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/X_citri_citri_MN12.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/X_cucurbitae_ATCC23378.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/X_euroxanthea.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/X_euvesicatoria_alfalfae_CFBP3836.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/X_fragariae_PD885.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/X_hortorum_B07007.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/X_hyacinthi_CFBP1156.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/X_hydrangeae_GBBC2199.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/X_oryzae_oryzicola_YM15.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/X_theicola_CFBP4691.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/X_translucens_undulosa_XtLr8.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/X_vesicatoria_ATCC35937_LMG911.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/Xap_15-088.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/Xap_CITA33.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/Xap_CuR.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/Xap_IVIA2626.1.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/Xap_Xcp1.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/Xcc_03-1638-1-1.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/Xcc_GD82.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/Xhc_ICMP7383.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/Xhv_CFBP498.gff3
Reading gff file /Users/herber4/Desktop/COPPER/COPPER_MASTER/bakta_pano/bakta_gff/Xp_LH3.gff3
Error in `[[<-`(`*tmp*`, name, value = "X_albilineans_FJ.1.gff3") :
1 elements in value to replace 0 elements

Is there anything you can do to help troubleshoot this?

best,

Austin

load_pangenomeRDS() fails with old third party objects

load_pangenomeRDS() expects an attribute named parent_package, which may be missing in older objects. Pagoo should be able to handle these cases, either by falling back to loading a plain pagoo object, and/or by letting the user supply the package at load time: x <- load_pangenomeRDS("pangenome.rds", pkg = "pewit")
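
The fallback logic could be as simple as the sketch below: an explicit `pkg` argument wins, then the stored attribute, then a default. `resolve_parent_package` is a hypothetical helper and the toy objects stand in for saved pangenomes; this is not the package's loader.

```r
# Hypothetical helper (not part of pagoo): decide which package should be
# used to rebuild a saved object.
resolve_parent_package <- function(obj, pkg = NULL, default = "pagoo") {
  if (!is.null(pkg)) return(pkg)                    # user override
  stored <- attr(obj, "parent_package", exact = TRUE)
  if (is.null(stored)) default else stored          # old objects lack the attribute
}

old_obj <- list(data = 1:3)                         # no parent_package attribute
new_obj <- structure(list(data = 1:3), parent_package = "pewit")

resolve_parent_package(old_obj)                     # falls back to the default
resolve_parent_package(new_obj)                     # uses the stored attribute
resolve_parent_package(old_obj, pkg = "pewit")      # uses the explicit argument
```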

roary_2_pagoo Error: subscript contains invalid names

Hi
I encountered the following error while trying to create the R6 class object. This is the script I'm using, followed by the error:

library(pagoo)

gffs <- list.files(pattern = "[.]gff$", recursive = TRUE, full.names = TRUE)

gpa_csv <- "/home/jason/Documents/pagoo/gene_presence_absence.csv"

p <- roary_2_pagoo(gene_presence_absence_csv = gpa_csv, gffs = gffs, sep = "__", paralog_sep = "\t")

Reading csv file (roary).
Processing csv file.
Reading gff file ./10432_62_LANL.gff
Reading gff file ./107V1216_BRAC.gff
Reading gff file ./1154_74_LANL.gff
Reading gff file ./11S_UM.gff
Reading gff file ./1346_SC.gff
Reading gff file ./1362_SC.gff
.
.
.
Reading gff file ./YB8E08_UA.gff
Reading gff file ./YN2011004_YPCDCP.gff
Reading gff file ./YN89004_YPCDCP.gff
Reading gff file ./YN97083_YPCDCP.gff
Error: subscript contains invalid names

Add Willenbrock et al. 2007 coregenome fit function

See:
https://genomebiology.biomedcentral.com/articles/10.1186/gb-2007-8-12-r267#Sec8

how to add metadata in gene_presence_absence.csv file?

In your paper (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8487088/pdf/main.pdf) you show a PCA of accessory genes with respect to host, using the script given below. Could you explain how to analyze this type of metadata in detail? Specifically, I would like to know how you added the metadata to the analysis or to the csv file.
p$gg_pca(color = "Host", size = 4) +
theme_bw(base_size = 15) +
scale_color_brewer(palette = "Set2")

Thank you !

Remove metadata option

If the metadata contains a duplicated or extra column, there is no way to remove or modify it. This means that if you save an object with wrong metadata, you have to redo every step to generate a new object without the wrong or extra column.

library(pagoo)
toy_rds <- system.file('extdata', 'campylobacter.RDS', package = 'pagoo')
a<-load_pangenomeRDS(toy_rds)
a$organisms$country<-NULL

# Error in (function ()  : 
#   unused argument (base::quote(new("DFrame", rownames = c("1", "2", "3", "4", "5", "6", "7"), nrows = 7, listData = list(1:7, 
#   c("FR15", "FR27", "AR1", "AR8", "AR12", "CA1", "TW6"), c("2008/170h", "2012/185h", "99/801", "04/875", "06/195", "001A-
#  0374", "1830"), c(2008, 2012, 1999, 2004, 2006, 2005, 2008), c("Human", "Human", "Bovine", "Bovine", "Bovine", "Human", 
#  "Human"), c("Feces", "Blood", "Prepuce", "Fetus", "VM", "Blood", "Blood"), c("ERS672247", "ERS672259", "ERS739235", 
#  "ERS739242", "ERS739246", "ERS686652", "ERS739261")), elementType = "ANY", elementMetadata = NULL, metadata = 
#  list())))
