Giter Site home page Giter Site logo

pln-team / plnmodels Goto Github PK

View Code? Open in Web Editor NEW
52.0 7.0 19.0 110 MB

A collection of Poisson lognormal models for multivariate count data analysis

Home Page: https://pln-team.github.io/PLNmodels

License: GNU General Public License v3.0

R 84.41% C++ 14.29% TeX 1.31%
poisson-lognormal-model count-data network-inference pca multivariate-analysis r-package

plnmodels's People

Contributors

bastien-mva avatar brgew avatar ctrapnell avatar fgindraud avatar giopogg avatar jchiquet avatar julieaubert avatar maddyduran avatar mahendra-mariadassou avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

plnmodels's Issues

parallel fit?

when fitting the PLNPCA up to 5 dimensions, it uses 10 cores. I wonder if we could use more cores inside each fit to speed up the process?

Definition of BIC/ICL is package dependent: we should explicitly say that it should be maximized in PLN

In PLNmodels, the BIC (and similarly EBIC, ICL, ...) is defined by loglik - 1/2 log(n) * n_param. It allows us to easily compare these criteria with the loglikelihood. However it is often defined in others packages by -2 loglik + log(n) n_param. So we should explicitly say that in PLNmodels, 'the greater BIC, the better',

This can be done in the console when calling plot.PLNfamily or added in the print method.

Vignettes are not build nor downloaded with devtools::install_github

Hi,

Installing the package from the Github repo with devtools (as recommended in the README) did not build the vignettes

devtools::install_github("jchiquet/PLNmodels", build_vignettes = TRUE)

The vignettes directory was not even downloaded from the Github repo by R. I tested a manual download of the archive and a local installation but it did not build the vignettes either.

However, the issue is solved (vignettes are properly built) when removing the default parameters of the build_opts options (as follows)

devtools::install_github("jchiquet/PLNmodels",build_opts = c("--no-resave-data"))
# instead of the following default 
# build_opts = c("--no-resave-data", "--no-manual",   "--no-build-vignettes")

Nonetheless, some dependencies from the vignettes (e.g. GGally or tidyverse) needed to be installed by the user.

Hope you wanted to know
Charlie

PLN does not work when Y is a matrix with a single column

The following code fails

library(PLNmodels)
Y1 <- matrix(rpois(10, exp(0.5)), ncol = 1)
Y2 <- matrix(rpois(10, exp(0.5)), ncol = 1)
Y  <- cbind(Y1, Y2)
colnames(Y1) <- "Y1"
colnames(Y) <- c("Y1", "Y2")
PLN(Y ~ 1) # this works
PLN(Y1 ~ 1) # this does not work
Error in matrix(0, nrow(Y), ncol(Y)) : non-numeric matrix extent

Can't extract the StARS probability for all edges

Once the stability selection has been ran, it would be convenient to extract the probabilities from StARS for any model selected (BIC, StARS etc..). However, the stability_path data frame contains numeric indexes for the nodes instead of the nodes labels.

I wrote few lines to extract such probabilities just to check if it was feasible:

Data

library(PLNmodels)
library(tidyverse)
data("mollusk")
mollusc <- prepare_data(mollusk$Abundance, mollusk$Covariate, offset = "TSS")
mollusc<- mollusc %>%
  mutate(site_new =
           fct_collapse(site,
                        Gravier = c("GGravier1", "GGravier2", "GGravier3"),
                        Negria = c("Negria1", "Negria2")
           )
  )

Nets<-PLNnetwork( Abundance ~ 0 + site_new + offset(log(Offset)),data=mollusc,
                  control_init = list(nPenalties = 40,min.ratio=1e-2))
net_BIC<-getBestModel(Nets, "BIC")
Nets$stability_selection()

or Nets <- readRDS("Nets_debug_StARS.RDS") using the RDS (and zipped) file attached.

Issue


addStARsProba<-function(PLNfamily,PLNnet,debug=FALSE){
  # Check if stability was performed
  if (anyNA(PLNfamily$stability)) stop("stability selection has not yet been performed! Use stability_selection()")
  # Get the node labels
  node.labels <- colnames(PLNnet$latent_network())
  # Fetch the stability_path table and construct a named list with 
  #  the edges identifiers as names and the StARS probability as values (for the given penalty)
  probs<-PLNfamily$stability_path %>%
    # had to round to the same digits otherwise no intersection
    filter(round(Penalty,6)==round(PLNnet$penalty,6)) %>% 
    # Create the edge sequence
    mutate(NamedEdge=paste(node.labels[as.numeric(Node1)],
                           node.labels[as.numeric(Node2)],sep="|")) %>%
    select(x = Prob, nm = NamedEdge) %>% pmap(set_names) %>% unlist # as named list
  # Get the igraph object
  g<-plot(PLNnet,plot=F,remove.isolated = F)
  # match the edge sequence and add the probabilities
  if(debug){
    list("E" = probs[igraph::as_ids(igraph::E(g))],
         "Probs"=probs) 
  } else{ probs[igraph::as_ids(igraph::E(g))]}
}

# Extract the probabilities
addStARsProba(Nets, net_BIC)
Bit|Vac Vac|Vea    <NA>    <NA> Ans|Anl Ans|An- Ans|Pio Anl|An- Anl|Pio Pic|Pis Pin|Cat Pim|Pi- Cam|Cat 
   0.80    0.65      NA      NA    0.80    1.00    0.35    0.60    0.40    0.60    0.70    0.80    0.95 
Euf|Vea 
   1.00 

Presence of some NAs, meaning some edges in the selected network are not in the stability selection.

> which(colnames(net_BIC$latent_network())=="Lit")
[1] 7
> which(colnames(net_BIC$latent_network())=="Pis")
[1] 25
> Nets$stability_path[Nets$stability_path$Edge=="7--25",]
[1] Edge    Penalty Prob    Node1   Node2  
<0 lignes> (ou 'row.names' de longueur nulle)
> Nets$stability_path[Nets$stability_path$Edge=="25--7",]
[1] Edge    Penalty Prob    Node1   Node2  
<0 lignes> (ou 'row.names' de longueur nulle)

So the numerical indexes of the nodes in the StARS data frame would not be in the same order as the initial table?

I thought about tweaking the bit of code there but it seems to depends on the edge_to_node which I was not sure what it was doing.

Thanks for considering this issue,
Charlie

Nets_debug_StARS.zip

Add dependencies that are not on CRAN?

prepare_data_from_biom and prepare_data_from_phyloseq depend respectively on biomformat and phyloseq which are not on the CRAN but on Bioconductor and we only use those packages to have access to extractor functions. Right now, I don't importFrom the extractors to limit the number of dependencies but this results in warnings when checking the package.

Should we:

  • add the bioconductor dependencies (knowing that phyloseq already depends on biomformat)
  • or ditch the two functions and only mention them in some "import data" vignettte?

Offset

Create a set_offset method to implement a few popular normalization methods: TSS, CSS, GMPR, etc.

Add options to getModel

Method getModel is very sensitive to rounding errors on param value (especially in PLNnetwork). Add support to:

  • choose model by number in the list (like networks$models[[i]] but using the getModel interface) or
  • allow approximate param value matching (I guess matching up to 1e-8 should be enough).

Fails of some tests in CRAN devel version of R due t change of class in R base

Dear maintainer,

Please see the problems shown on
https://cran.r-project.org/web/checks/check_results_PLNmodels.html.

Specifically, see the problems shown for the r-devel Debian checks.

These can be reproduced by checking with --as-cran using current
r-devel, which for now sets

R_CLASS_MATRIX_ARRAY=true

in the check environment, to the effect that

R> class(matrix(1 : 4, 2, 2))
[1] "matrix" "array"

(and no longer just "matrix" as before).

According to the R NEWS file,

For now only active when environment variable R_CLASS_MATRIX_ARRAY
is set to non-empty, but planned to be the new unconditional behavior
when R 4.0.0 is released:

matrix objects now also inherit from class "array", namely, e.g.,
class(diag(1)) is c("matrix", "array") which invalidates code
assuming that length(class(obj)) == 1, an incorrect assumption that
is less frequently fulfilled now.

S3 methods for "array", i.e., .array(), are now also
dispatched for matrix objects.

Apparently your package no longer works correctly when
class(matrix(...)) gives a vector of length two: please fix as
necessary.

See
https://developer.r-project.org/Blog/public/2019/11/09/when-you-think-class.-think-again/index.html
for more information about correctly using class() in package code.

Please correct before 2019-12-18 to safely retain your package on CRAN.

Best,
-k

Extend to distributions mix

Was wondering if the algorithms here could be extended to a mix of count/binomial/continuous variables factors estimation in a unique algorithm. Say I have two Poisson and two Gaussian and 2 binomial variables?

Add preconditionning in variational optimization for CCSAQ

Preconditionning is available in nlopt for CCSAQ and may speed up the optimizaiton process.

For the basic PLN model, all calculations have been derived for the Hessian and for the product of the Hessian with a 2 (n x p) vectors of variational parameters vec(M,S), and should also be memory efficient (same storage as the gradient actually).

Prediction in PLNmodels

Hi @mahendra-mariadassou,

Regarding the prediction function in
https://github.com/jchiquet/PLNmodels/blob/abcfa7072c488dd1037a00fa1b7a6dc7f9148585/R/PLNfit-class.R#L288

I was wondering why we use the expectancy of Z, that is exp(XB + O), for the prediction and not the expectancy (under the variational distribution) of what we denoted A in the model (which is equal to exp(O + M + X B + S/2) for PLN). It should now be possible to compute this with a single call to the V-EM step function that you wrote, even for new data...

What do yo think ?

Adjustment of the penalty matrix to include prior independent features/species

Hi!
The ranges of penalties (lambda) is currently uniform for the entire covariance matrix except the diagonal during the glasso step. Could it be possible to specify the penalties of some pairs of features (or species) within the PLNnetwork function ?

Prior sparsity could then already be included if some pairs were not of interest. Such additional feature could be done by providing the indices of the entries to be set to zeroes just like in the the glasso package, and set such pairs to a high penalty (ex. lambda_[max] or a huge number).
However glassoFast only accepts penalty matrices so a post treatment of the generated matrix of penalty to set the pairs to high penalties. Such post treatment could happen there https://github.com/jchiquet/PLNmodels/blob/bf14cc032f0cd011fd1fd4afbb2e30cccc548abe/R/PLNnetworkfit-class.R#L76

Or the user could provide the entire matrix to PLNnetwork function, but it should be the list of matrices over the range of the suggested penalties and it seems a bit more difficult to include.

Thanks,
Charlie

Make the interface more GLM-like?

Would be handy to be able to provide a formula and the data where to pick the terms so it could deal with interactions automatically. Currently I understand I need to pass df$covariate_1 etc in the call to PLN(PCA) Also if I want interactions I have to build the vector before hand which is quite cumbersome.

Ideally something like PLNPCA(formula = output ~ x1 + x2 * x3, data = df)

Network reconstruction on very sparse matrices

Hello,

I submitted a GitHub issue a few months back asking whether the PLNnetwork() function works on a matrix of Poisson distributed data that's been transformed so it's not an integer. That issue is closed and the function worked fine. However, I tried running PLNnetwork() on the matrix containing raw counts and it basically returned an identity matrix as the network, where all the values are zero except for the diagonal elements.

The raw count matrix is very sparse (the majority of non-zero observations have counts of one or two). There are also only six observations. Does PLNnetwork() automatically remove observations with low counts, similar to how a lot of ecological analyses ignore species that are only counted once (i.e., singletons)?

Again, thank you for building and maintaining this package and for responding to issues. I've already gotten a lot of use out of this package.

Best,
Will

Posterior probabilities in PLNLDA

Change predict.PLNLDAfit so it:

  • reports posterior probabilities (in addition to MAP and likelihoods)
  • allows for user-specified prior probabilities (instead of the ones recovered from the training data set).
  • computes average positions in the latent (e.g. as weighted averages of the positions conditional under each group).

Add weights parameters

I imagine it could be possible to weight each observation differently with a weight vector.
An argument weights would be handy in the PLN(PCA) calls (like in glm basically)

Message d'erreur lors d'une tentative d'installation sous Mac OS

Infos sur la session :

R version 3.4.1 (2017-06-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: OS X El Capitan 10.11.6

devtools::install_github("jchiquet/PLNmodels")
Downloading GitHub repo jchiquet/PLNmodels@master
from URL https://api.github.com/repos/jchiquet/PLNmodels/zipball/master
Installing PLNmodels
'/Library/Frameworks/R.framework/Resources/bin/R' --no-site-file --no-environ --no-save
--no-restore --quiet CMD INSTALL
'/private/var/folders/rz/5x9sk5zn3x78vx78l4q8yk100000gn/T/RtmpxDaFGu/devtools9337ae5d612/jchiquet-PLNmodels-661d206'
--library='/Library/Frameworks/R.framework/Versions/3.4/Resources/library' --install-tests

  • installing source package ‘PLNmodels’ ...
    Using NLOPT_LIBS=-L/usr/local/Cellar/nlopt/2.4.2_2/lib -lnlopt -lm
    ** libs
    clang++ -std=gnu++11 -I/Library/Frameworks/R.framework/Resources/include -DNDEBUG -I"/Library/Frameworks/R.framework/Versions/3.4/Resources/library/Rcpp/include" -I"/Library/Frameworks/R.framework/Versions/3.4/Resources/library/RcppArmadillo/include" -I/usr/local/include -fopenmp -fPIC -Wall -g -O2 -c RcppExports.cpp -o RcppExports.o
    clang: error: unsupported option '-fopenmp'
    make: *** [RcppExports.o] Error 1
    ERROR: compilation failed for package ‘PLNmodels’
  • removing ‘/Library/Frameworks/R.framework/Versions/3.4/Resources/library/PLNmodels’
    Installation failed: Command failed (1)

Use the pipe "|" separator of edges instead of "--"

Minor issue here, but if #27 modifies the lines close to L150, it could be useful to change as well the separator from "--" to "|" L155.
Igraph uses the "--" to "print" the edges but the edge identifiers are coded with "|" instead to provide both directed and undirected and it is useful for subsetting.

library(igraph)
g<-graph_from_literal( Alice-Bob-Cecil-Alice, Daniel-Cecil-Eugene, Cecil-Gordon )
> E(g)
+ 6/6 edges from dbcead1 (vertex names):
[1] Alice--Bob    Alice--Cecil  Bob  --Cecil  Cecil--Daniel Cecil--Eugene Cecil--Gordon
> as_ids(E(g))
[1] "Alice|Bob"    "Alice|Cecil"  "Bob|Cecil"    "Cecil|Daniel" "Cecil|Eugene" "Cecil|Gordon"

Flexible offset computations

Change compute_offset so that it can compute "block-wise" offsets (for example one for bacterial taxa and one for fungal taxa).

Samples with NA in covariates are silently removed

Hi again,

I just noticed that some samples were removed if NA were present in the covariates used (reprex below). Perhaps some users would like to know at least as a warning, like in the prepare_data function, before the #8 is solved, I guess. Otherwise, one can still check before including them.

data("trichoptera")
trichoptera <- prepare_data(trichoptera$Abundance, trichoptera$Covariate)
trichoptera$Pressure[29:32]<-NA
fit <- PLN(Abundance ~ 1 + Pressure + Temperature, data = trichoptera)

> dim(fitted(fit))
[1] 45 17
> dim(trichoptera$Abundance)
[1] 49 17

> data.frame(
+ fitted   = as.vector(fitted(fit)),
+ observed = as.vector(trichoptera$Abundance)
+ )
Error in data.frame(fitted = as.vector(fitted(fit)), observed = as.vector(trichoptera$Abundance)) : 
  les arguments impliquent des nombres de lignes différents : 765, 833

Fisher matrix for PLNPCA

Check whether Fisher matrix computation is correct for PLNPCA (matrices have different dimensions than in standard PLN, some matrices maybe non conformable).

improve doc/vignette on PCA

I managed to run a PLNPCA on my data but I m lost at how to get the following elements:
rotation matrix?
covariates regressors coefficients values?
Principal components values?

my model is something like PLNPCA(pois_matrix ~ log_exposure_time + numeric_covariate)

plot_network fails when latent_network is of class dsyMatrix

The problem comes from igraph::graph_from_adjacency_matrix which expects Matrix objects to be sparse.

## MWE
x <- new("dsyMatrix"
    , x = c(0, 0.0830239291515631, 0.124614466958739, 0.0830239291515631, 
0, 0.169519822826263, 0.124614466958739, 0.169519822826263, 0
)
    , Dim = c(3L, 3L)
    , Dimnames = list(c("ARTHAUD Nathalie", "ASSELINEAU François", "CHEMINADE Jacques"
), c("ARTHAUD Nathalie", "ASSELINEAU François", "CHEMINADE Jacques"
))
    , uplo = "U"
    , factors = list()
)

igraph::graph_from_adjacency_matrix(x)

This can be fixed by forcing latent_network to output a sparse Matrix.

Modify predict method for PLNPCA

Change predict method PLNPCA to easily handle new data by (i) estimating position in the latent space (without changing others parameters) and (ii) using those positions to predict class.

PLN LDA can't plot when the grouping variable has only two levels

When running PLN LDA when the grouping variable (here Group) has only two levels (TRUE,FALSE), the visualisation of the correlation circle is not implemented yet and raise the following error:

Error in combn(axes, 2, simplify = FALSE) : n < m

Here is a MWE:

library(ade4)
data("trichometeo")
trichometeo$meteo$Group <- trichometeo$meteo$Precip.Tot > 0

X<-as.matrix(trichometeo$fau)
G<-as.factor(trichometeo$meteo$Group)
library(PLNmodels)
LDA<-PLNLDA(X,G)
LDA$plot_LDA()

PLNnetwork_stabs should be a method of PLNnetworkfamily

The resampling procedure implemented in PLNnetwork_stabs relies on the sample data (responses, covariates, offsets) and parameters (penalties).

To avoid bad specifications, useless copies, I suggest to pass this function as a method of PLNnetworkfamily.

This is a prior requirement for answering Issue 5, that is, implementaiton of StARS.

Problem when forcing the penalties in PLNnet

I get an error when trying to fix the penalties for PLNnetwork, even when reusing the penalties provided by default:

PLNnet = PLNnetwork(Y ~ 1);
Initialization...
Adjusting 30 PLN with sparse inverse covariance estimation
Joint optimization alternating gradient descent and graphical-lasso
sparsifying penalty = 0.593625
Post-treatments
DONE!
lambda = PLNnet$penalties

lambda = exp(seq(1.5, -1.5, length.out=50))

PLNnet = PLNnetwork(Y ~ 1, penalties=lambda)
Initialization...
Adjusting 30 PLN with sparse inverse covariance estimation
Joint optimization alternating gradient descent and graphical-lasso
Error in if ((convergence[iter] < control$ftol_out


Plot method for PLN network

Salut Julien,

je continue mon exploration de PLNnetwork. J'ai l'impression qu'il y a une blague dans l'option la fonction plot() en présence de covariables. La figure en bas à gauche ne contient qu'une colonne alors qu'elle devrait en comprendre autant que de covariables.

PLNLDA plot fails with one dimension (aka with a 2-level grouping factor)

The following code runs ok:

library(PLNmodels)
data("mollusk")
mollusc <- prepare_data(mollusk$Abundance, mollusk$Covariate, offset = "TSS")
mollusc$Trt<-sample(c("A","B"),size = nrow(mollusc),replace = T)
LDA_TEST <- PLNLDA(Abundance ~ 1 + site + offset(log(Offset)), mollusc, grouping = mollusc$Trt)

However, the plot method fails with

plot(LDA_TEST)
Don't know how to automatically pick scale for object of type function. Defaulting to continuous.
Erreur : All columns in a tibble must be 1d or 2d objects:
* Column `colour` is function
Call `rlang::last_error()` to see a backtrace

And the backtrace corresponds to:

<error>
message: All columns in a tibble must be 1d or 2d objects:
* Column `colour` is function
class:   `rlang_error`
backtrace:
  1. graphics::plot(LDA_TESTdiag)
  2. PLNmodels:::plot.PLNLDAfit(LDA_TESTdiag)
  3. x$plot_LDA(nb_axes = nb_axes, var_cols = var_cols, plot = plot)
  4. gridExtra::arrangeGrob(...)
  5. base::lapply(grobs[toconv], ggplot2::ggplotGrob)
  6. ggplot2:::FUN(X[[i]], ...)
  9. ggplot2:::ggplot_build.ggplot(x)
 10. ggplot2:::by_layer(function(l, d) l$compute_aesthetics(d, plot))
 11. ggplot2:::f(l = layers[[i]], d = data[[i]])
 12. l$compute_aesthetics(d, plot)
 13. ggplot2:::f(..., self = self)
 14. ggplot2:::as_gg_data_frame(evaled)
 17. tibble:::as_tibble.list(x)
 18. tibble:::lst_to_tibble(x, .rows, .name_repair, col_lengths(x))
 19. tibble:::check_valid_cols(x)
Call `rlang::last_trace()` to see the full backtrace

It seems to incriminate the tibble.

dimnames in Post-Treatment (Matrix, matrix or data.frame objects)

Salut Julien,

je copie ci-dessous un bout de code dans lequel PLN plante (1e appel "PLN(Y ~ X)") mais ne plante plus si je nomme les colonnes de la matrice des réponses (2e appel "PLN(Y ~ X)"). J'ai le même plantage en simulant Y avec rPLN (3e appel "PLN(Y ~ X)").

S.

> rm(list=ls())
> library(PLNmodels)
> n = 10; d = 3; p = 10
> X = matrix(rnorm(n*d), n, d)
> Y = matrix(rpois(n*p, 1), n, p)

> PLN(Y ~ X)

 Initialization...
 Adjusting a PLN model with full covariance model
 Post-treatments...Error in dimnamesGets(x, value) : 
  invalid dimnames given for “dgCMatrix” object

> colnames(Y) = paste0('Y', 1:p)
> PLN(Y ~ X)

 Initialization...
 Adjusting a PLN model with full covariance model
 Post-treatments...
 DONE!
A multivariate Poisson Lognormal fit with full covariance model.
==================================================================
 nb_param   loglik      BIC     ICL R_squared
       95 -100.612 -209.985 -73.565     0.007
==================================================================
* Useful fields
    $model_par, $latent, $var_par, $optim_par
    $loglik, $BIC, $ICL, $loglik_vec, $nb_param, $criteria
* Useful S3 methods
    print(), coef(), sigma(), vcov(), fitted(), predict(), standard_error()

> Y = rPLN (n=n, X%*%rnorm(d), Sigma = diag(p))
> PLN(Y ~ X)

 Initialization...
 Adjusting a PLN model with full covariance model
 Post-treatments...Error in dimnamesGets(x, value) : 
  invalid dimnames given for “dgCMatrix” object

Error in common_samples(counts, covariates) : Incompatible dimensions

Hello Julien,
I tried to create the list input in order to use PLNmodels and I encountered some problems.

I have two dataframes :

head(Covariante)

Prec0city Rep G L S M T MG B0u1dFree Year
1 1 1 1 1 0 1 1 1 0 2017
2 1 1 1 1 1 1 1 1 0 2017
3 1 1 1 1 1 1 0 1 0 2017
4 1 1 1 1 1 1 1 1 0 2017
5 1 1 1 1 1 0 1 1 0 2017
6 1 1 1 1 1 1 1 1 0 2017

head(Abundance)

U1k10w1_75 U1k10w1_34 U1k10w1_27 U1k10w1_111 U1k10w1_100 U1k10w1_6 U1k10w1_56 U1k10w1_48 U1k10w1_60 U1k10w1_68
4023 21132 0 18661 0 97939 11972 7695 4804 5309
5544 15848 0 15970 0 86601 10622 7005 4112 4016
3692 18743 0 16992 0 83510 12104 6355 4135 5051
U1k10w1_69 U1k10w1_115 U1k10w1_31 U1k10w1_35 U1k10w1_49 U1k10w1_112 U1k10w1_130 U1k10w1_37 U1k10w1_32
9119 11325 82049 23164 10984 16138 0 20213 18832
7458 0 55167 20371 10559 13495 0 18999 20433
8679 9141 48772 22972 10898 14701 10551 19786 18824
...
...
...

I created a list of these dataframes :
liste=list(Abundance=Abundance, Covariante=Covariante)

str(liste, max.level = 1)

List of 2
$ Abundance :'data.frame': 444 obs. of 256 variables:
$ Covariante:'data.frame': 444 obs. of 10 variables:

names(liste$Covariante)

[1] "Prec0city" "Rep" "G" "L" "S" "M" "T" "MG" "B0u1dFree"
[10] "Year"

names(liste$Abundance)

[1] "U1k10w1_75" "U1k10w1_34" "U1k10w1_27" "U1k10w1_111" "U1k10w1_100" "U1k10w1_6" "U1k10w1_56"
[8] "U1k10w1_48" "U1k10w1_60" "U1k10w1_68" "U1k10w1_69" "U1k10w1_115" "U1k10w1_31" "U1k10w1_35"
[15] "U1k10w1_49" "U1k10w1_112" "U1k10w1_130" "U1k10w1_37" "U1k10w1_32"
...
...

So I think here I have the good entries for the function prepare_data, so I tested :

data_prepare <- prepare_data(counts = liste$Abundance, covariates = liste$Covariate)

Error in common_samples(counts, covariates) : Incompatible dimensions
In addition: Warning message:
In common_samples(counts, covariates) :
There are no matching names in the count matrix and the covariates data.frame.
Function will proceed assuming:

  • samples are in the same order;
  • samples are rows of the abundance matrix.

My samples are in the same order and samples are rows of the abundance matrix, so no problem with the warnings. But I don't understand the error :

Error in common_samples(counts, covariates) : Incompatible dimensions

I have the same number of rows, so what is the problem of incompatible dimensions ?

Thank you for your help.
Best,
Amandine

Does the method work for Poisson distributed data that's not an integer?

Hello, I enjoyed reading your paper and would like to use your method. I have to duel boot my mac to run ubuntu before I can install and run your software though, so I wanted to ask my question before I try yo get your software running.

My data consists of Poisson distributed count data that has been divided by a constant to account for a covariate. Does your approach only work for integer count data? Or can it work with transformed count data that's no longer an integer.

Best,
Will

Inversion issue in fisher information matrix

In the last version (which compute approximated standard error of the coefficients), I tried

library(PLNmodels)
data(mollusk)
out <- PLNPCA(Abundance ~ 1 + log(offset(duration)) + site, data = mollusk, rank = 5)

which fails with

Error in .local(a, b, ...) : 
  cs_lu(A) failed: near-singular A (or out of memory) 
15..local(a, b, ...) 
14.solve(.) 
13.solve(.) 
12.function_list[[i]](value) 
11.freduce(value, `_function_list`) 
10.`_fseq`(`_lhs`) 
9.eval(quote(`_fseq`(`_lhs`)), env, env) 
8.eval(quote(`_fseq`(`_lhs`)), env, env) 
7.withVisible(eval(quote(`_fseq`(`_lhs`)), env, env)) 
6.fim %>% solve %>% diag %>% sqrt %>% matrix(nrow = self$d) %>%  t() at PLNfit-class.R#298
5.self$compute_standard_error() at PLNfit-class.R#190
4.super$postTreatment(responses, covariates, offsets, weights) at PLNPCAfit-class.R#122
3.model$postTreatment(self$responses, self$covariates, self$offsets,  self$weights) at PLNfamily-class.R#57
2. myPCA$postTreatment() at PLNPCA.R#64
1. PLNPCA(Abundance ~ 1 + log(offset(duration)) + site, data = mollusk, 
    rank = 5) 

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.