
theoreticalecology / s-jSDM


Scalable joint species distribution modeling

Home Page: https://theoreticalecology.github.io/s-jSDM/

License: GNU General Public License v3.0

R 64.15% Python 23.91% Jupyter Notebook 11.94%
species-distribution-modelling species-interactions machine-learning deep-learning gpu-acceleration

s-jsdm's Introduction

Project Status: Active – The project has reached a stable, usable state and is being actively developed. License: GPL v3.

s-jSDM - Fast and accurate Joint Species Distribution Modeling

About the method

The method is described in Pichler & Hartig (2021) A new joint species distribution model for faster and more accurate inference of species associations from big community data, https://doi.org/10.1111/2041-210X.13687. The code for producing the results in this paper is available under the subfolder publications in this repo.

The method itself is wrapped into an R package, available under the subfolder sjSDM. You can also use it stand-alone under Python (see instructions below). Note: for both the R and the Python package, Python >= 3.7 and PyTorch must be installed (more details below).

Installing the R / Python package

R-package

Install the package via

install.packages("sjSDM")

Dependencies for the package can be installed before or after installing the package. Detailed explanations of the dependencies are provided in vignette("Dependencies", package = "sjSDM"). Very briefly, the dependencies can be installed automatically from within R:

sjSDM::install_sjSDM(version = "gpu") # or
sjSDM::install_sjSDM(version = "cpu")

To cite sjSDM, please use the following citation:

citation("sjSDM")

Development

If you want to install the current (development) version from this repository, run

devtools::install_github("https://github.com/TheoreticalEcology/s-jSDM", subdir = "sjSDM", ref = "master")

Once the dependencies are installed, the following code should run:

Workflow

Simulate a community and fit a sjSDM model:

library(sjSDM)
## ── Attaching sjSDM ──────────────────────────────────────────────────── 1.0.4 ──

## ✔ torch <environment> 
## ✔ torch_optimizer  
## ✔ pyro  
## ✔ madgrad
set.seed(42)
community <- simulate_SDM(sites = 100, species = 10, env = 3, se = TRUE)
Env <- community$env_weights
Occ <- community$response
SP <- matrix(rnorm(200, 0, 0.3), 100, 2) # spatial coordinates (no effect on species occurrences)

model <- sjSDM(Y = Occ, env = linear(data = Env, formula = ~X1+X2+X3), spatial = linear(data = SP, formula = ~0+X1:X2), se = TRUE, family=binomial("probit"), sampling = 100L)
summary(model)
## Family:  binomial 
## 
## LogLik:  -510.9816 
## Regularization loss:  0 
## 
## Species-species correlation matrix: 
## 
##  sp1  1.0000                                 
##  sp2 -0.3780  1.0000                             
##  sp3 -0.2050 -0.4070  1.0000                         
##  sp4 -0.1850 -0.3860  0.8220  1.0000                     
##  sp5  0.6820 -0.4090 -0.1240 -0.0730  1.0000                 
##  sp6 -0.3050  0.4870  0.1630  0.1510 -0.1220  1.0000             
##  sp7  0.5830 -0.1190  0.0960  0.1200  0.5520  0.2450  1.0000         
##  sp8  0.3140  0.1690 -0.5280 -0.5460  0.2330 -0.0480  0.1300  1.0000     
##  sp9 -0.0620 -0.0250  0.0840  0.0640 -0.4010 -0.3430 -0.2060 -0.1380  1.0000 
##  sp10     0.2080  0.4750 -0.7140 -0.6490  0.2540  0.1410  0.1480  0.4560 -0.2850  1.0000
## 
## 
## 
## Spatial: 
##            sp1       sp2      sp3       sp4      sp5      sp6      sp7      sp8
## X1:X2 2.103188 -4.041381 3.452883 0.2332844 2.681165 1.325118 3.126471 1.928931
##             sp9     sp10
## X1:X2 0.9001696 1.262238
## 
## 
## 
##                  Estimate Std.Err Z value Pr(>|z|)    
## sp1 (Intercept)   -0.0847  0.2671   -0.32  0.75124    
## sp1 X1             1.3854  0.5241    2.64  0.00820 ** 
## sp1 X2            -2.4736  0.4839   -5.11  3.2e-07 ***
## sp1 X3            -0.2583  0.4362   -0.59  0.55385    
## sp2 (Intercept)   -0.0145  0.2601   -0.06  0.95560    
## sp2 X1             1.2578  0.5233    2.40  0.01625 *  
## sp2 X2             0.2357  0.4909    0.48  0.63112    
## sp2 X3             0.6825  0.4302    1.59  0.11266    
## sp3 (Intercept)   -0.5653  0.2861   -1.98  0.04819 *  
## sp3 X1             1.4285  0.5099    2.80  0.00509 ** 
## sp3 X2            -0.4155  0.5096   -0.82  0.41489    
## sp3 X3            -1.1364  0.4898   -2.32  0.02034 *  
## sp4 (Intercept)   -0.1156  0.2580   -0.45  0.65406    
## sp4 X1            -1.5792  0.4921   -3.21  0.00133 ** 
## sp4 X2            -1.9313  0.5088   -3.80  0.00015 ***
## sp4 X3            -0.4306  0.4314   -1.00  0.31822    
## sp5 (Intercept)   -0.2109  0.2526   -0.83  0.40378    
## sp5 X1             0.7425  0.4843    1.53  0.12525    
## sp5 X2             0.5624  0.4582    1.23  0.21969    
## sp5 X3            -0.7171  0.4154   -1.73  0.08433 .  
## sp6 (Intercept)    0.2184  0.2707    0.81  0.41973    
## sp6 X1             2.6087  0.5552    4.70  2.6e-06 ***
## sp6 X2            -1.1176  0.5271   -2.12  0.03400 *  
## sp6 X3             0.2021  0.4461    0.45  0.65049    
## sp7 (Intercept)   -0.0719  0.2448   -0.29  0.76903    
## sp7 X1            -0.3372  0.4899   -0.69  0.49132    
## sp7 X2             0.3403  0.4328    0.79  0.43175    
## sp7 X3            -1.4822  0.4269   -3.47  0.00052 ***
## sp8 (Intercept)    0.1574  0.1625    0.97  0.33270    
## sp8 X1             0.3657  0.3158    1.16  0.24688    
## sp8 X2             0.3236  0.3102    1.04  0.29688    
## sp8 X3            -1.2363  0.2850   -4.34  1.4e-05 ***
## sp9 (Intercept)    0.0235  0.2003    0.12  0.90667    
## sp9 X1             1.4160  0.3943    3.59  0.00033 ***
## sp9 X2            -1.0606  0.3755   -2.82  0.00473 ** 
## sp9 X3             0.7943  0.3444    2.31  0.02111 *  
## sp10 (Intercept)  -0.0825  0.2076   -0.40  0.69104    
## sp10 X1           -0.5510  0.3781   -1.46  0.14505    
## sp10 X2           -1.3145  0.3777   -3.48  0.00050 ***
## sp10 X3           -0.5257  0.3590   -1.46  0.14310    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(model)

We support other distributions (see the count-data example after this list):

  • Count data with Poisson:
model <- sjSDM(Y = Occ, env = linear(data = Env, formula = ~X1+X2+X3), spatial = linear(data = SP, formula = ~0+X1:X2), se = TRUE, family=poisson("log"))
  • Count data with negative binomial (still experimental; if you run into errors or problems, please let us know):
model <- sjSDM(Y = Occ, env = linear(data = Env, formula = ~X1+X2+X3), spatial = linear(data = SP, formula = ~0+X1:X2), se = TRUE, family="nbinom")
  • Gaussian (normal):
model <- sjSDM(Y = Occ, env = linear(data = Env, formula = ~X1+X2+X3), spatial = linear(data = SP, formula = ~0+X1:X2), se = TRUE, family=gaussian("identity"))
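For the count families, Y must contain counts rather than 0/1 data. A minimal, purely illustrative sketch (the rpois() counts below are simulated ad hoc and are not produced by simulate_SDM):

Counts <- matrix(rpois(100 * 10, lambda = 2), 100, 10) # hypothetical count matrix: 100 sites x 10 species
model_counts <- sjSDM(Y = Counts,
                      env = linear(data = Env, formula = ~X1+X2+X3),
                      spatial = linear(data = SP, formula = ~0+X1:X2),
                      family = poisson("log"))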

Anova

ANOVA can be used to partition the three components (abiotic, biotic, and spatial):

an = anova(model)
print(an)
## Analysis of Deviance Table
## 
## Terms added sequentially:
## 
##           Deviance Residual deviance R2 Nagelkerke R2 McFadden
## Abiotic  157.95722        1177.48500       0.79394      0.1139
## Biotic   175.41278        1160.02944       0.82694      0.1265
## Spatial   17.38643        1318.05579       0.15959      0.0125
## Full     385.98836         949.45386       0.97893      0.2784
plot(an)

The ANOVA shows the relative changes in R2 attributable to the groups and their intersections.

Internal metacommunity structure

Following Leibold et al. (2022), we can calculate and visualize the internal metacommunity structure (i.e., the partitioning of the three components across species and sites). The internal structure is already calculated by the ANOVA, and we can visualize it with the plot method:

results = plotInternalStructure(an) # or plot(an, internal = TRUE)
## Registered S3 methods overwritten by 'ggtern':
##   method           from   
##   grid.draw.ggplot ggplot2
##   plot.ggplot      ggplot2
##   print.ggplot     ggplot2

The plot function returns the results for the internal metacommunity structure (env = environmental, spa = spatial, codist = co-distribution, i.e. biotic associations):

print(results$data$Species)
##           env         spa     codist         r2
## 1  0.17677667 0.000000000 0.16810146 0.03375475
## 2  0.08724636 0.026656011 0.18072040 0.02946228
## 3  0.12613742 0.004529856 0.21004115 0.03407084
## 4  0.16648179 0.000000000 0.15890110 0.03241345
## 5  0.08585343 0.005811074 0.16168802 0.02533525
## 6  0.18787936 0.012341719 0.11489709 0.03151182
## 7  0.10765006 0.012898782 0.13292549 0.02534743
## 8  0.12445149 0.015040332 0.06188116 0.02013730
## 9  0.17762242 0.000000000 0.04357315 0.02196159
## 10 0.08805858 0.017406001 0.13890574 0.02443703

Installation troubleshooting

If the installation fails, check out the help of ?install_sjSDM, ?installation_help, and vignette("Dependencies", package = "sjSDM").

  1. Try install_sjSDM()
  2. Start a new R session; if no 'PyTorch not found' message appears, the installation worked. Otherwise, see ?installation_help
  3. If you still cannot get the package to run, open an issue on the issue tracker or write an email to maximilian.pichler at ur.de (the typical sequence is sketched in code below)
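In code, the typical sequence looks roughly like this (a sketch; install_diagnostic() is the diagnostic helper referenced in ?install_diagnostic):

sjSDM::install_sjSDM(version = "cpu")  # step 1: install PyTorch and the other Python dependencies
# restart R, then:
library(sjSDM)                         # step 2: no 'PyTorch not found' message should appear
# if loading still fails, collect diagnostic information for an issue report:
sjSDM::install_diagnostic()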

Python Package

pip install sjSDM_py

Python example

import sjSDM_py as fa
import numpy as np
import torch
Env = np.random.randn(100, 5)
Occ = np.random.binomial(1, 0.5, [100, 10])

model = fa.Model_sjSDM(device=torch.device("cpu"), dtype=torch.float32)
model.add_env(5, 10)
model.build(5, optimizer=fa.optimizer_adamax(0.001),scheduler=False)
model.fit(Env, Occ, batch_size = 20, epochs = 10)
# print(model.weights)
# print(model.covariance)
## Iter: 10/10 100%|##########| [00:00, 26.74it/s, loss=7.144]

Calculate Importance:

Beta = np.transpose(model.env_weights[0])
Sigma = ( model.sigma @ model.sigma.t() + torch.diag(torch.ones([1])) ).data.cpu().numpy()
covX = fa.covariance( torch.tensor(Env).t() ).data.cpu().numpy()

fa.importance(beta=Beta, covX=covX, sigma=Sigma)
## {'env': array([[ 1.2717709e-02,  7.8437300e-03,  6.6514793e-03, -3.3015787e-04,
##          2.3806898e-04],
##        [ 9.9158729e-05, -1.8891758e-06,  1.0537009e-03,  4.0511694e-04,
##          1.1120385e-02],
##        [ 6.1564189e-03,  5.9850062e-03,  9.2307013e-03,  5.4843356e-03,
##         -3.3683516e-04],
##        [ 1.3349474e-02,  4.3294221e-04,  1.8103119e-03,  1.4068705e-02,
##          6.6316797e-04],
##        [ 3.3953122e-05,  3.1304134e-03,  2.6658648e-03, -3.6165391e-05,
##          7.3677581e-03],
##        [ 2.7722977e-03,  1.9519718e-03,  4.8086399e-04,  3.0876237e-03,
##          1.7828522e-04],
##        [ 7.9284189e-03,  5.7881157e-04,  7.5722663e-03,  2.1802005e-06,
##          4.2433664e-03],
##        [-5.1329907e-06,  6.0144444e-03,  2.1059261e-05,  7.5954124e-03,
##          1.5537007e-03],
##        [ 7.7161851e-04,  1.7209088e-02,  4.8407568e-03,  1.8020724e-03,
##          5.6920521e-04],
##        [ 1.7108561e-02,  7.2742125e-04,  9.4651995e-04,  6.8342132e-03,
##          1.1830850e-02]], dtype=float32), 'biotic': array([0.9728792 , 0.9873236 , 0.9734804 , 0.9696754 , 0.9868381 ,
##        0.9915289 , 0.97967494, 0.9848205 , 0.9748072 , 0.9625524 ],
##       dtype=float32)}

s-jsdm's People

Contributors

caiwang0503, florianhartig, maximilianpi, warriorkt


s-jsdm's Issues

CPU dtype="float64" error

A user encountered overflow problems and the use of doubles should help here, but:

> com = simulate_SDM(env = 3L, species = 5L, sites = 100L)
> ## fit model:
> model = sjSDM(Y = com$response,env = com$env_weights, iter = 2L, dtype = "float64") 
Iter: 0/2   0%|          | [00:00, ?it/s]
 Error in py_call_impl(callable, dots$args, dots$keywords) : 
  RuntimeError: expected scalar type Float but found Double Timing stopped at: 0.018 0 0.019

Phylogeny

I got a question from a user about how to include a phylogenetic distance matrix. At some point we will have to tackle this problem.
At the moment I can think of two options:
a) use the phylogenetic distance matrix as a kind of species-species "prior" on the env weights
b) treat phylogenetic eigenvectors as traits and fit a fourth-corner model (as done in the gllvm package); see the eigenvector sketch below
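To illustrate option b): phylogenetic eigenvectors can be obtained from the distance matrix by an ordinary PCoA. A minimal sketch with a random placeholder distance matrix (sjSDM does not yet accept these as traits; this only shows what the eigenvectors would look like):

D <- as.matrix(dist(matrix(rnorm(10 * 5), 10, 5))) # placeholder "phylogenetic" distance matrix for 10 species
phylo_eigen <- cmdscale(D, k = 3)                  # first 3 phylogenetic eigenvectors, one row per species
head(phylo_eigen)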

CRAN release

To do:

  • move dependencies vignette to package help
  • update description (e.g. description field)
  • update sjSDM documentation (add reference to Chen et al. 2018 and our preprint)
  • NA and data format see #37
  • plot.sjSDM method with an image plot of the associations
  • prepare CRAN submission notice
  • revise anova and R-squared; change anova to rely on importance
  • improve documentations, add empirical dataset
  • improve tuning function, e.g. add best model, and #83 #84
  • improve installation! (e.g. see the gluonts package or rstudio-keras)
  • remove dependencies (pyro-ppl etc.)

installation problems

I'm having trouble installing on macOS, which is weird because I previously installed without problems (and then it stopped working). I removed both conda env folders (r-sjSDM and sjSDM_env) before installing, and I only have miniconda2.

conda create --name sjSDM_env python=3.7
conda activate sjSDM_env
conda install pytorch torchvision cpuonly -c pytorch # cpu
devtools::install_github("https://github.com/TheoreticalEcology/s-jSDM", subdir = "sjSDM", build_vignettes = TRUE, build_manual = TRUE)

library(sjSDM)
install_sjSDM(version = "cpu", conda_python_version = "3.7") 

I get this error:

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

All requested packages already installed.

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.

PackagesNotFoundError: The following packages are not available from current channels:

  • torchvision
  • torch

Current channels:

To search for alternate channels that may provide the conda package you're
looking for, navigate to

https://anaconda.org

and use the search bar at the top of the page.

Installation failed... Try to install manually PyTorch (install instructions: https://github.com/TheoreticalEcology/s-jSDM
If the installation still fails, please report the following error on https://github.com/TheoreticalEcology/s-jSDM/issues
one or more Python packages failed to install [error code 1]

install issues

Hi Max, as I said, on my new system, it first didn't work at all (conda not found). I installed Anaconda with python 3.7

I now re-installed reticulate, and now it finds the python system (so, do we maybe have to increase the minimum version for reticulate? Unfortunately, not sure which version I had before).

However, now I get

PackagesNotFoundError: The following packages are not available from current channels:

  - torch
  - torchvision

Current channels:

  - https://conda.anaconda.org/conda-forge/osx-64
  - https://conda.anaconda.org/conda-forge/noarch
  - https://repo.anaconda.com/pkgs/main/osx-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/r/osx-64
  - https://repo.anaconda.com/pkgs/r/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.

I have a call now, will try to solve this later, just to let you know.

Improve documentation

Vignettes should answer/cover (derived from user question):

  • how to evaluate model fit?
  • which learning rate should I choose?
  • what about model selection?

Anything to add? @florianhartig

nn.Sequential to DNN()

It should be possible to pass neural network objects (torch.nn.Sequential(...)) directly to the DNN() config; e.g., this could be used to pass custom NNs such as CNNs or pre-trained NNs to sjSDM.

DNN support

Implementation of functional DNN api (same style as in rstudio-keras)

install to google colab / kaggle

Hey, I want to install sjSDM on Google Colab, but I get this error message. I don't know if it's easy to fix, but it's better to check with you.
Error: Failed to install 'sjSDM' from GitHub:
(converted from warning) installation of package ‘/tmp/RtmpJEcT8x/file436a79464/sjSDM_0.1.3.9000.tar.gz’ had non-zero exit status
Traceback:

  1. devtools::install_github("https://github.com/TheoreticalEcology/s-jSDM",
    . subdir = "sjSDM", auth_token = "xxxxxxx")
  2. pkgbuild::with_build_tools({
    . ellipsis::check_dots_used(action = getOption("devtools.ellipsis_action",
    . rlang::warn))
    . {
    . remotes <- lapply(repo, github_remote, ref = ref, subdir = subdir,
    . auth_token = auth_token, host = host)
    . install_remotes(remotes, auth_token = auth_token, host = host,
    . dependencies = dependencies, upgrade = upgrade, force = force,
    . quiet = quiet, build = build, build_opts = build_opts,
    . build_manual = build_manual, build_vignettes = build_vignettes,
    . repos = repos, type = type, ...)
    . }
    . }, required = FALSE)
  3. install_remotes(remotes, auth_token = auth_token, host = host,
    . dependencies = dependencies, upgrade = upgrade, force = force,
    . quiet = quiet, build = build, build_opts = build_opts, build_manual = build_manual,
    . build_vignettes = build_vignettes, repos = repos, type = type,
    . ...)
  4. tryCatch(res[[i]] <- install_remote(remotes[[i]], ...), error = function(e) {
    . stop(remote_install_error(remotes[[i]], e))
    . })
  5. tryCatchList(expr, classes, parentenv, handlers)
  6. tryCatchOne(expr, names, parentenv, handlers[[1L]])
  7. value[3L]

on.load() checks

I think we check for pytorch, but not for python / conda, right? I would add such a check.

As said, maybe best to get a general diagnostics function, which checks the system for requirements, and provides a comprehensive error message, together with the note to post this in GitHub in case the problem persists?
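A rough sketch of what such a diagnostics function could look like, based on reticulate (the function name and message texts are made up for illustration):

check_requirements <- function() {
  # check for a Python installation first
  if (!reticulate::py_available(initialize = FALSE))
    stop("No Python/conda installation found - run sjSDM::install_sjSDM() first.")
  # then check each required Python module
  for (mod in c("torch", "torch_optimizer", "pyro", "madgrad")) {
    if (!reticulate::py_module_available(mod))
      message("Python module '", mod, "' not found - see ?installation_help or ?install_diagnostic.")
  }
}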

multiple gpu when running 'sjSDM'

Hi,
We figured out that there's no argument 'n_gpu' in the function 'sjSDM', but only in 'sjSDM_cv'. Is it possible to use multiple GPUs to run the 'sjSDM' function at all? If so, is it implemented yet?
Thanks a lot!

Better error message for missing pytorch installation?

Without installing, I got this error message when running the sjSDM

Error in sjSDM(X = com$env_weights, Y = com$response, iter = 10L) : 
  object 'fa' not found

I assume that is because of the missing pytorch install. Given that we can anticipate that a user would forget to do this, maybe provide a better error message?

error: ModuleNotFoundError: No module named 'pyro'

Dear Max,

I have the following error when installing the latest version of the package:

.onLoad failed in loadNamespace() for 'sjSDM', details: call: py_module_import(module, convert = convert) error: ModuleNotFoundError: No module named 'pyro'

However, pyro has been installed on my Windows system with Anaconda.
Maybe there is a path to change somewhere to allow proper installation.

Best wishes,

François

Importance, R^2 and p-values for single env predictors.

Hi,

maybe I just haven't found it, but is there a way to see the importance, R^2, and p-value of a single environmental predictor?
e.g.
model <- sjSDM(Y = Occ, env = linear(data = Env, formula = ~X1+X2+X3), spatial = linear(data = SP, formula = ~0+X1:X2), se = TRUE, family=binomial("probit"), sampling = 100L, device = 'gpu' )

Where can I see the contribution of X3 in the model? If I understood your outputs correctly, all predictors from the env argument are summed into A in the anova() and under env in the importance() output?!

Best regards,
Julian

NumPy array is not writeable, and PyTorch does not support non-writeable tensors

Dear colleagues,

I have the following issue when attempting to run s-jSDM with the R package:

..\torch\csrc\utils\tensor_numpy.cpp:141: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program.

I can't figure out what the problem is here, or how to resolve it.
Do you have any idea?

Best wishes,

François

Install error sh: line 1: 79672 Killed: 9

Some users get error messages such as the following during install

sh: line 1: 79672 Killed: 9               R_TESTS= '/Library/Frameworks/R.framework/Resources/bin/R' --no-save --no-restore --no-echo 2>&1 < '/var/folders/m_/zb7c8p_13k59p3zrpw84c4hm0000gq/T//Rtmpc0CxFQ/file13725537e4ec1'
ERROR: loading failed
* removing ‘/Library/Frameworks/R.framework/Versions/4.0/Resources/library/sjSDM’
Warning in install.packages :
  installation of package ‘/Users/pedro/Desktop/sjSDM_0.0.6.9000.tar.gz’ had non-zero exit status​

Migrating sjSDM code to AWS

Hello,

I'm trying to run an R script that uses 'sjSDM' to do model training on AWS SageMaker. I'm trying to run the code in a Docker container, but the installation procedure fails to install PyTorch and all the other sjSDM dependencies. I'm trying to install sjSDM and dependencies in a Dockerfile using RUN R -e "remotes::install_github('https://github.com/TheoreticalEcology/s-jSDM', subdir = 'sjSDM', dep=FALSE)" and RUN R -e "sjSDM::install_sjSDM(version = 'gpu')".

I wanted to point out this issue for anyone who tries to migrate sjSDM code to AWS, but I would also like to solve this. Thanks.

Python install

Hi, Max, I just removed the link that didn't work in fc6fb9b

Does the rest of the pip stuff work (e.g. pip install sjSDM_py), or was this changed now that the code is in the package?

dependency installation issue in 0.1.8 - missing madgrad

I reinstalled s-jSDM this morning to get the importance update. Loading the package gave this readout:

── Attaching sjSDM ──────────────────────────────────────────────────── 0.1.8 ──
✔ torch
✔ torch_optimizer
✔ pyro
✖ madgrad

Torch or other dependencies not found:
1. Use install_sjSDM() to install Pytorch and conda automatically
2. Installation trouble shooting guide: ?installation_help
3. If 1) and 2) did not help, please create an issue on https://github.com/TheoreticalEcology/s-jSDM/issues (see ?install_diagnostic)

I tried install_sjSDM() with version = "cpu" which successfully added madgrad, but removed pyro. I also tried version = c("cpu","gpu") and version = "gpu" but that didn't change anything. install_sjSDM() says that all requirements are satisfied, including pyro, but the package still won't load successfully.

install diagnostic

Hi Max, I wonder if we should merge install diagnostic with installation_help. Seems to me logical to have both functions together.

Also, possibly, I wonder if check_dependencies or so would be a better name for the function?

linear() doesn't accept formula as object

It doesn't seem possible to pass a formula as an object to sjSDM() with linear().

set.seed(42)
# simulate data
community <- simulate_SDM(sites = 100, species = 10, env = 3)
Env <- community$env_weights

Env <- as.data.frame(Env)

# make formula
form1 <- as.formula(~V1+V2+V3)
form1

Env.lin1 <- linear(data = Env, formula = form1) # this throws an error: (Error: object of type 'symbol' is not subsettable)

Env.lin2 <- linear(data = Env, formula = ~V1+V2+V3) # this is OK

Memory problems for importance() with large covariances

Question from a user (redacted for conciseness and privacy):

... we have been working on analyzing an absolutely enormous XXX dataset with s-jSDM.

Good news: given enough processors and memory, s-jSDM does handle datasets in the tens of thousands of species pretty well.

However, I have run into a subsequent memory problem when attempting to parse the importance from the model output. I’ve looked at the code for the function and I’m pretty sure it stems from the matrix multiplication expression involving the species covariance matrix (unsurprising, given its size).

So I was wondering: have either of you run any tests on resource requirements for the importance function to see how they scale with the number of species?
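For a rough sense of the scaling, the dense species-species covariance matrix alone grows quadratically with the number of species (a back-of-the-envelope calculation in R, assuming doubles):

S <- 20000              # number of species
S^2 * 8 / 1024^3        # one dense S x S double matrix: ~3 GB
# the matrix products in importance() form intermediate objects of this size,
# so several such matrices may be held in memory simultaneously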

readme suggestions

change to simulate_SDM(sites = 100, species = 50, env = 5)
or change to matrix(rnorm(800), 100, 2)

Also, I find ternary diagrams easier to read if the three elements are on the axes (e.g. environment at the top vertex, biotic bottom left, spatial bottom right).

sjSDM_cv() Error in unserialize(node$con) : error reading from connection

I get a weird error running sjSDM_cv(). I'm using R 4.0.0. My students are running R 3.6.3 and are able to run the test code and also run it on their dataset.

So I'm wondering if it's an R 4 thing.

# sjSDM_cv()
# simulate sparse community:
com = simulate_SDM(env = 5L, species = 25L, sites = 100L, sparse = 0.5)

# tune regularization:
tune_results = sjSDM_cv(Y = com$response,
                        env = com$env_weights, 
                        tune = "random", # random steps in tune-parameter space
                        CV = 3L, # 3-fold cross validation
                        tune_steps = 25L,
                        alpha_cov = seq(0, 1, 0.1),
                        alpha_coef = seq(0, 1, 0.1),
                        lambda_cov = seq(0, 0.1, 0.001), 
                        lambda_coef = seq(0, 0.1, 0.001),
                        n_cores = 2L, # small models can be also run in parallel on the GPU
                        iter = 2L # we can pass arguments to sjSDM via ...
                        )

Error in unserialize(node$con) : error reading from connection

AIC function

I have implemented a logLik function in 80c906b ... question is if we should also implement an AIC ... I would tend towards not, because of the problem of counting df.
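For reference, a user could assemble an AIC by hand from logLik(), but they would have to supply the parameter count themselves, which is exactly the df-counting problem (the value of k below is a placeholder, and I assume logLik() returns the log-likelihood):

ll  <- logLik(model)     # logLik method from 80c906b
k   <- 40                # placeholder: number of effective parameters (the contentious part)
aic <- -2 * as.numeric(ll) + 2 * k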

sjSDM::install_sjSDM(version = "cpu") seems to want pytorch

> sjSDM::install_sjSDM(version = "cpu")

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.

PackagesNotFoundError: The following packages are not available from current channels:

  - torchvision
  - torch

Current channels:

  - https://conda.anaconda.org/conda-forge/osx-64
  - https://conda.anaconda.org/conda-forge/noarch
  - https://repo.anaconda.com/pkgs/main/osx-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/r/osx-64
  - https://repo.anaconda.com/pkgs/r/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.



Installation failed... Try to install manually PyTorch (install instructions: https://github.com/TheoreticalEcology/s-jSDM
If the installation still fails, please report the following error on https://github.com/TheoreticalEcology/s-jSDM/issues
one or more Python packages failed to install [error code 1]

Traits - fourth corner model?

We could provide the option to include traits following the fourth corner model from Brown et al., 2015 (The GLLVM model does it this way).

Setting a multivariate penalty (prior) on the environmental predictors would be another option (I think Hmsc does it this way), but I think the former would be preferable because any type of penalty would interfere with the p-values.

Error message in sjSDM if pytorch not available

Hi, I just tried this out, if you run

com = simulate_SDM(env = 3L, species = 5L, sites = 100L)
model = sjSDM(Y = com$response,env = com$env_weights, iter = 10L)

without PyTorch (luckily, I can do this, as I still haven't updated), I get

 Error in reticulate::py_is_null_xptr(fa) : object 'fa' not found 
3.
reticulate::py_is_null_xptr(fa) at utils.R#84
2.
check_module() at sjSDM.R#58
1.
sjSDM(Y = com$response, env = com$env_weights, iter = 10L) 

whereas a good error message would say "pytorch not installed". I would just do the startup check also in sjSDM to check if the requirements are there.

test can't run in 0.1.8

hi there,
when I update to 0.1.8 and run the test model, it throws an error.
What happened with my Mac?
(screenshot attached: Screen Shot 2021-06-17 at 7.07.11 PM)

Model_sjSDM object has no attribute 'set_weights'

I get the following error :

pred = predict(model, test_X)
  # Error in py_get_attr_impl(x, name, silent) : 
  # AttributeError: 'Model_sjSDM' object has no attribute 'set_weights'

Indeed, the model has an attribute weights, but not an attribute set_weights.

Space

What shall we do about spatial predictors?

a) No api changes but provide an example with additional predictors in the env matrix
b) provide an extra argument in sjSDM for spatial predictors

Install "private" conda version?

Hi Max, just a follow-up to #23 - Now at least it works from a clean (= conda-free) computer. One thing that I am wondering - what happens if a user already has a conda version on their computer? At the moment, you are trying to use it, right?

Wouldn't it be safer to always install a dedicated "private" miniconda version for sjSDM?

Register importance and possibly other functions as S3 classes

Just running through Pedro's example while having the randomForest package loaded, I noted that loading randomForest first creates a problem

> imp = importance(model)
Error in UseMethod("importance") : 
  no applicable method for 'importance' applied to an object of class "c('sjSDM', 'linear')"

because randomForest registers importance as an S3 generic. Because of this, I think it would be safer to register all reasonably general-sounding functions as S3 generics, or else use names such as sjSDM_importance (but I would prefer the former). A sketch of the S3 mechanics follows below.
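For illustration, the S3 mechanics would look roughly like this (a sketch, not the actual package code):

# generic exported by sjSDM
importance <- function(x, ...) UseMethod("importance")
# method for sjSDM objects; once registered as an S3 method (S3method(importance, sjSDM) in NAMESPACE),
# dispatch on class "sjSDM" works even when another package also exports an 'importance' generic
importance.sjSDM <- function(x, ...) {
  # ... compute env/biotic/spatial importance for the fitted model ...
}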

p-values on env components

As discussed. If faster, I would calculate the hessian per species, as one can assume that env estimates will be approximately independent across species.

sjSDM_model - hide or push?

At the moment, sjSDM_model is only / mostly? used internally to build the model. I wonder - is it distracting to have this open, and should we rather hide it? If we're not hiding it, I would add it a bit more prominently to the help and link it to other functions.

Error in py_call_impl(callable, dots$args, dots$keywords) : can't convert np.ndarray of type numpy.object_

Dear Max,

I am sorry to bother you with a new issue when using sjSDM function with device = "gpu"...

Error in py_call_impl(callable, dots$args, dots$keywords) : 
  TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, int64, int32, int16, int8, uint8, and bool.

Detailed traceback: 
  File "C:\Program Files\R\R-3.6.2\library\sjSDM\python\sjSDM_py\model_new.py", line 207, in fit
    dataLoader = self._get_DataLoader(X, Y, SP, RE, batch_size, True, parallel)
  File "C:\Program Files\R\R-3.6.2\library\sjSDM\python\sjSDM_py\model_new.py", line 164, in _get_DataLoader
    torch.tensor(Y, dtype=torch.float32, device=torch.device('cpu')))

Error in py_call_impl: not enough values to unpack

I have the following error with sjSDM function,

Error in py_call_impl(callable, dots$args, dots$keywords) : ValueError: not enough values to unpack (expected 2, got 1)

Any idea on what could be the cause?

TypeError: type torch.cuda.FloatTensor not available

With the latest version, I have the new following error :

Error in py_call_impl(callable, dots$args, dots$keywords) : 
  TypeError: type torch.cuda.FloatTensor not available. Torch not compiled with CUDA enabled.

Detailed traceback: 
  File "C:\Program Files\R\R-3.6.2\library\sjSDM\python\sjSDM_py\model_new.py", line 171, in build
    torch.set_default_tensor_type('torch.cuda.FloatTensor')
  File "C:\ProgramData\Anaconda3\envs\r-reticulate\lib\site-packages\torch\__init__.py", line 206, in set_default_tensor_type
    _C._set_default_tensor_type(t)

Does the new version require reinstalling Torch or Cuda?
