consbiol-unibern / sdmtune


Performs variable selection and model tuning for Species Distribution Models (SDMs). It also provides several utilities to display results.

Home Page: https://consbiol-unibern.github.io/SDMtune/

License: Other

R 93.64% CSS 0.35% C++ 0.53% JavaScript 3.47% HTML 0.12% TeX 1.89%
hyperparameter-tuning species-distribution-modelling variable-selection

Contributors

sgvignali


sdmtune's Issues

[Bug]: The package is not accepting new blockCV version (3.0)

Describe the bug

Using the new version of blockCV to create spatial blocks with cv_spatial leads to an error when training the model, as shown in the output below.

Steps to reproduce the bug

library(SDMtune)
library(blockCV)  # provides cv_spatial_autocor() and cv_spatial()

# Generate spatial blocks
sp_range <- cv_spatial_autocor(r = predictors, 
                           num_sample = 5000)

spatial_blocks <- cv_spatial(x = pa_data,column = "occ",
                  size = 120798, hexagon = TRUE, 
                  selection = "systematic",
                  iteration = 50)

# Train model
model <- train(method = "Maxent", data = data, folds = spatial_blocks, reg = 1, iter = 500, seed = 777)

Session information

model <- train(method = "Maxent", data = data, folds = spatial_blocks, reg = 1, iter = 500, seed = 777)

Error in `.convert_folds()`:
! Folds object format not allowed.
Run `rlang::last_error()` to see where the error occurred.

> spatial_blocks 
[1] "cv_spatial"

> rlang::last_error()
<error/rlang_error>
Error in `.convert_folds()`:
! Folds object format not allowed.
---
Backtrace:
 1. SDMtune::train(...)
 2. SDMtune:::.convert_folds(folds, data)
Run `rlang::last_trace()` to see the full context

Additional information

No response

Reproducible example

  • I have done my best to provide the steps to reproduce the bug
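
For anyone hitting the same error, one possible workaround is to build the folds by hand. The following is only a sketch under two assumptions that should be checked against the installed versions: that `train()` accepts `folds` as a list of two logical matrices named `train` and `test` (one row per location, one column per fold), and that the blockCV 3.x object exposes a vector of fold ids (understood to be `spatial_blocks$folds_ids`; `str(spatial_blocks)` will show what is there). The helper `ids_to_folds()` and the example ids are hypothetical.

```r
# Hypothetical helper: turn a vector of fold ids (one id per location,
# assumed to contain no NA) into two logical matrices.
ids_to_folds <- function(fold_ids) {
  k <- sort(unique(fold_ids))
  # Column j of `test` is TRUE for the locations held out in fold j
  test <- vapply(k, function(j) fold_ids == j, logical(length(fold_ids)))
  # Every location not held out in a fold is used for training in that fold
  list(train = !test, test = test)
}

# Made-up ids for six locations split into three folds
fold_ids <- c(1, 2, 3, 1, 2, 3)
folds <- ids_to_folds(fold_ids)
folds$test
```

If the assumptions hold, the resulting list could be passed as `folds = folds` in the `train()` call above, provided the ids are in the same row order as the locations in `data`.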

[Bug]: `modelReport` using RF model fails due to mismatch in levels of categorical variables

Describe the bug

Hi there! I am using SDMtune version 1.3.1. When I use the modelReport function with a Random Forest model, I get the same error reported in closed issues #11 and #8:

Error in `predict.randomForest()`:
! Type of predictors in new data do not match that of the training data.

I understand that this issue was addressed in issue #8 by adding a factors parameter to the modelReport function, where the levels of the categorical variables included in the model could be provided. However, this parameter is not available in version 1.3.1: I checked the documentation for this function, as well as the source code, and it is definitely not there.

It would be great if I could get some ideas on how to address this issue.

Steps to reproduce the bug

library(SDMtune)

files <- list.files(path = file.path(system.file(package = "dismo"), "ex"),
                    pattern = "grd",
                    full.names = TRUE)

predictors <- terra::rast(files)

# Prepare presence and background locations
p_coords <- virtualSp$presence
bg_coords <- virtualSp$background

# Create SWD object
data <- prepareSWD(species = "Virtual species",
                   p = p_coords,
                   a = bg_coords,
                   env = predictors,
                   categorical = "biome")

# Split presence locations in training (80%) and testing (20%) datasets
datasets <- trainValTest(data,
                         test = 0.2,
                         only_presence = TRUE)
train <- datasets[[1]]
test <- datasets[[2]]

# Train a model
model <- train(method = "RF",
               data = train)

#Produce report
modelReport(model, folder = "test", test = test,
            response_curves = T, only_presence = TRUE, jk = TRUE,
            permut = 2)

Session information

R version 4.2.2 (2022-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.1 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] SDMtune_1.3.1

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.0     terra_1.7-29         xfun_0.39            bslib_0.4.2         
 [5] lattice_0.20-45      colorspace_2.1-0     vctrs_0.6.5          generics_0.1.3      
 [9] htmltools_0.5.5      viridisLite_0.4.2    yaml_2.3.7           utf8_1.2.4          
[13] rlang_1.1.2          pillar_1.9.0         jquerylib_0.1.4      withr_2.5.2         
[17] glue_1.6.2           sp_1.6-0             plyr_1.8.8           lifecycle_1.0.4     
[21] stringr_1.5.0        munsell_0.5.0        gtable_0.3.4         ragg_1.2.4          
[25] rvest_1.0.3          raster_3.6-20        codetools_0.2-18     kableExtra_1.3.4    
[29] evaluate_0.21        labeling_0.4.3       knitr_1.43           fastmap_1.1.1       
[33] fansi_1.0.6          highr_0.10           Rcpp_1.0.11          scales_1.3.0        
[37] cachem_1.0.8         webshot_0.5.4        jsonlite_1.8.4       farver_2.1.1        
[41] systemfonts_1.0.4    textshaping_0.3.6    ggplot2_3.4.4        digest_0.6.31       
[45] stringi_1.7.12       dplyr_1.1.2          dismo_1.3-9          grid_4.2.2          
[49] cli_3.6.2            tools_4.2.2          magrittr_2.0.3       sass_0.4.6          
[53] tibble_3.2.1         randomForest_4.7-1.1 pkgconfig_2.0.3      xml2_1.3.3          
[57] rmarkdown_2.21       svglite_2.1.0        httr_1.4.6           rstudioapi_0.15.0   
[61] plotROC_2.3.0        R6_2.5.1             compiler_4.2.2

Additional information

No response

Reproducible example

  • I have done my best to provide the steps to reproduce the bug
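
The situation named in the title, a mismatch in the levels of a categorical variable, can be illustrated in base R. This is a generic sketch with made-up values, not the package's fix: it only shows how two factors can hold compatible values yet carry different level sets, and how the new data can be re-declared with the training levels.

```r
# Made-up training and prediction values for a categorical predictor
train_biome <- factor(c(1, 2, 4, 7, 13))
new_biome   <- factor(c(2, 7, 7))

# The level sets differ, although every new value was seen in training
same_before <- identical(levels(train_biome), levels(new_biome))

# Re-declare the new data with the training levels before predicting
new_biome_aligned <- factor(as.character(new_biome),
                            levels = levels(train_biome))
same_after <- identical(levels(train_biome), levels(new_biome_aligned))

c(before = same_before, after = same_after)
```

Whether this can be applied from outside modelReport depends on where the function builds its prediction data, which is not shown in this report.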

Differences in variable importance when passing a SDMmodelCV object compared to passing each of the models inside the SDMmodelCV object

Hello,

I'm training a maxnet model with CV to model the occurrence of a disease. I have noticed an important difference in variable importance when I pass the SDMmodelCV object to the function varImp, compared with passing each of the models inside the SDMmodelCV object.

Here are the commands and results:

# crossvalidation with train data ( AUC value will be averaged across the different models )
model <- train("Maxnet", p = presence, a = bg_model, fc = "lh", reg = 0.5, seed = 37, rep = 3)

# permutation importance
vi_maxnet <- varImp(model)
vi_maxnet

                           Variable Permutation_importance    sd
1         LC08_2014_2015_cva_change                   17.6 0.001
2  LC08_2014_2015_cva_change_dist_2                   10.5 0.001
3  LC08_2014_2015_cva_change_dist_1                    9.9 0.001
4                 rf_class_20140719                    9.9 0.001
5  LC08_2014_2015_cva_change_dist_4                    9.6 0.000
6  LC08_2014_2015_cva_change_dist_3                    9.2 0.000
7          LC08_2014_2015_cva_angle                    8.6 0.000
8           LC08_20140719_tasscap.3                    8.4 0.000
9           LC08_20140719_tasscap.2                    8.3 0.000
10          LC08_20140719_tasscap.4                    8.0 0.001

Here are the results of passing each of the 3 models in the object:

varImp(model@models[[1]])
                           Variable Permutation_importance    sd
1  LC08_2014_2015_cva_change_dist_4                   28.8 0.024
2           LC08_20140719_tasscap.4                   15.9 0.035
3  LC08_2014_2015_cva_change_dist_2                   15.5 0.019
4  LC08_2014_2015_cva_change_dist_1                   15.4 0.025
5                 rf_class_20140719                    8.1 0.014
6         LC08_2014_2015_cva_change                    8.0 0.014
7           LC08_20140719_tasscap.2                    3.4 0.005
8          LC08_2014_2015_cva_angle                    2.4 0.006
9  LC08_2014_2015_cva_change_dist_3                    2.2 0.007
10          LC08_20140719_tasscap.3                    0.5 0.002

varImp(model@models[[2]])
                           Variable Permutation_importance    sd
1  LC08_2014_2015_cva_change_dist_4                   40.8 0.023
2  LC08_2014_2015_cva_change_dist_1                   16.2 0.031
3  LC08_2014_2015_cva_change_dist_2                   15.3 0.026
4                 rf_class_20140719                    7.7 0.011
5           LC08_20140719_tasscap.2                    7.4 0.018
6         LC08_2014_2015_cva_change                    4.4 0.011
7           LC08_20140719_tasscap.4                    3.8 0.014
8  LC08_2014_2015_cva_change_dist_3                    3.0 0.005
9          LC08_2014_2015_cva_angle                    0.7 0.004
10          LC08_20140719_tasscap.3                    0.6 0.004

varImp(model@models[[3]])
                           Variable Permutation_importance    sd
1  LC08_2014_2015_cva_change_dist_4                   53.4 0.030
2  LC08_2014_2015_cva_change_dist_1                   15.5 0.020
3  LC08_2014_2015_cva_change_dist_2                   13.1 0.015
4           LC08_20140719_tasscap.2                    5.6 0.009
5                 rf_class_20140719                    5.4 0.007
6           LC08_20140719_tasscap.4                    3.4 0.009
7          LC08_2014_2015_cva_angle                    3.1 0.006
8  LC08_2014_2015_cva_change_dist_3                    0.5 0.001
9           LC08_20140719_tasscap.3                    0.1 0.000
10        LC08_2014_2015_cva_change                    0.0 0.000

The result of the first command and the importances obtained do not seem to be an average (as I would expect) of the importance of those variables within each of the models in the SDMmodelCV object. How are they estimated and how should they be interpreted then? Any hints are more than welcome.

Thanks in advance
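
To make the gap concrete, the simple average the question expects can be computed from the tables above. The sketch below copies the values for two of the ten variables and sets the per-model mean beside the value reported for the SDMmodelCV object. It does not explain how varImp derives the cross-validated figures; it only quantifies the difference.

```r
# Permutation importance (%) copied from the three per-model tables above
per_model <- data.frame(
  Variable = rep(c("LC08_2014_2015_cva_change_dist_4",
                   "LC08_2014_2015_cva_change"), each = 3),
  Model = rep(1:3, times = 2),
  Permutation_importance = c(28.8, 40.8, 53.4,  # dist_4 in models 1 to 3
                             8.0, 4.4, 0.0)     # cva_change in models 1 to 3
)

# Simple average across the three models
avg <- aggregate(Permutation_importance ~ Variable, data = per_model, FUN = mean)
names(avg)[2] <- "Mean_of_models"

# Values reported by varImp() on the SDMmodelCV object (first table above)
reported <- c(LC08_2014_2015_cva_change = 17.6,
              LC08_2014_2015_cva_change_dist_4 = 9.6)
avg$Reported_CV <- unname(reported[as.character(avg$Variable)])
avg
```

For LC08_2014_2015_cva_change_dist_4 the per-model mean is 41.0 against a reported 9.6, so the cross-validated table is clearly not a plain average of the three per-model tables.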

Problems running the spatial cross validation examples

I am trying to run the spatial cross validation examples but they give me an error:

files <- list.files(path = file.path(system.file(package = "dismo"), "ex"), pattern = "grd", full.names = TRUE)

predictors <- raster::stack(files)

help(virtualSp)

p_coords <- virtualSp$presence
bg_coords <- virtualSp$background

data <- prepareSWD(species = "Virtual species", p = p_coords, a = bg_coords, env = predictors, categorical = "biome")

library(ENMeval)
block_folds <- get.block(occs = data@coords[data@pa == 1, ],
bg = data@coords[data@pa == 0, ])

model <- SDMtune::train(method = "Maxent", data = data, fc = "l", reg = 0.8, folds = block_folds)

Error in .convert_folds(folds, data) : Folds object format not allowed!
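
A possible way around the conversion step is to assemble the folds manually from the group ids. This is a sketch under stated assumptions, not a confirmed fix: it assumes that `train()` accepts a list of two logical matrices named `train` and `test`, and that `get.block()` returns separate group vectors for presences and background points (understood to be named `occs.grp` and `bg.grp` in ENMeval 2.x; `names(block_folds)` will show the actual names). The helper `groups_to_folds()` is hypothetical.

```r
# Hypothetical helper: combine separate presence and background group ids
# into train/test logical matrices with one row per location in `pa`
groups_to_folds <- function(pa, occs_grp, bg_grp) {
  stopifnot(sum(pa == 1) == length(occs_grp),
            sum(pa == 0) == length(bg_grp))
  ids <- numeric(length(pa))
  ids[pa == 1] <- occs_grp  # presence rows receive their group id
  ids[pa == 0] <- bg_grp    # background rows receive theirs
  k <- sort(unique(ids))
  test <- vapply(k, function(j) ids == j, logical(length(ids)))
  list(train = !test, test = test)
}

# Made-up example: 4 presences and 4 background points in 4 blocks
pa <- c(1, 1, 1, 1, 0, 0, 0, 0)
folds <- groups_to_folds(pa, occs_grp = c(1, 2, 3, 4), bg_grp = c(4, 3, 2, 1))
colSums(folds$test)
```

With real data, `pa` would be `data@pa` and the two group vectors would come from `block_folds`, so the rows line up with the subsets passed to `get.block()` above.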

modelReport can't find ROC curve file and fails

When I try to use the modelReport function, everything seems to be working fine until after "Writing model settings...", when I get:

File ./plots/ROC_curve.png not found in resource path
Error: pandoc document conversion failed with error 99
In addition: Warning message:
In matrix(as.numeric(d)) : NAs introduced by coercion

[Bug]: HTML file output from modelReport does not bring up photos

Describe the bug

Photos of graphs will not come up if the folder containing the model is moved from its original location. In the HTML file, I will click a photo of a graph and it will pop up saying "ERR_FILE_NOT_FOUND" only after moving the directory containing the graphs.

Steps to reproduce the bug

library(SDMtune)

modelReport(
  model = object_model,
  folder = "folder",
  test = object_test,
  type = "cloglog", # default output type
  response_curves = TRUE,
  jk = TRUE,
  verbose = TRUE
  )

Session information

R version 4.2.3 (2023-03-15 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8    LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                           LC_TIME=English_United States.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] terra_1.7-46     sf_1.0-14        plotROC_2.3.0    kableExtra_1.3.4 dismo_1.3-14     raster_3.6-23    sp_2.0-0         ENMeval_2.0.4   
 [9] magrittr_2.0.3   rJava_1.0-6      devtools_2.4.5   usethis_2.2.2    here_1.0.1       lubridate_1.9.2  forcats_1.0.0    stringr_1.5.0   
[17] dplyr_1.1.2      purrr_1.0.2      readr_2.1.4      tidyr_1.3.0      tibble_3.2.1     ggplot2_3.4.3    tidyverse_2.0.0  SDMtune_1.3.1   

loaded via a namespace (and not attached):
 [1] fs_1.6.3           webshot_0.5.5      httr_1.4.7         rprojroot_2.0.3    bslib_0.5.1        tools_4.2.3        profvis_0.3.8     
 [8] utf8_1.2.3         R6_2.5.1           KernSmooth_2.23-20 DBI_1.1.3          colorspace_2.1-0   urlchecker_1.0.1   withr_2.5.0       
[15] tidyselect_1.2.0   prettyunits_1.1.1  processx_3.8.2     compiler_4.2.3     textshaping_0.3.6  cli_3.6.1          rvest_1.0.3       
[22] xml2_1.3.5         labeling_0.4.3     sass_0.4.7         scales_1.2.1       classInt_0.4-10    proxy_0.4-27       callr_3.7.3       
[29] systemfonts_1.0.4  digest_0.6.33      rmarkdown_2.25     svglite_2.1.1      pkgconfig_2.0.3    htmltools_0.5.5    sessioninfo_1.2.2 
[36] highr_0.10         fastmap_1.1.1      htmlwidgets_1.6.2  rlang_1.1.1        rstudioapi_0.15.0  shiny_1.7.5        farver_2.1.1      
[43] jquerylib_0.1.4    generics_0.1.3     jsonlite_1.8.7     Rcpp_1.0.11        munsell_0.5.0      fansi_1.0.4        lifecycle_1.0.3   
[50] stringi_1.7.12     yaml_2.3.7         plyr_1.8.8         pkgbuild_1.4.2     grid_4.2.3         promises_1.2.1     crayon_1.5.2      
[57] miniUI_0.1.1.1     lattice_0.20-45    hms_1.1.3          knitr_1.44         ps_1.7.5           pillar_1.9.0       codetools_0.2-19  
[64] pkgload_1.3.2.1    glue_1.6.2         evaluate_0.21      remotes_2.4.2.1    vctrs_0.6.3        tzdb_0.4.0         httpuv_1.6.11     
[71] foreach_1.5.2      gtable_0.3.4       cachem_1.0.8       xfun_0.39          mime_0.12          xtable_1.8-4       e1071_1.7-13      
[78] later_1.3.1        ragg_1.2.5         class_7.3-21       viridisLite_0.4.2  iterators_1.0.14   memoise_2.0.1      units_0.8-4       
[85] timechange_0.2.0   ellipsis_0.3.2

Additional information

No response

Reproducible example

  • I have done my best to provide the steps to reproduce the bug

Error in `optimizeModel` when passing a SDMmodelCV object

Hello,

I'm following the workflow in https://consbiol-unibern.github.io/SDMtune/articles/articles/tune_hyperparameters.html by adapting it to a particular case for modeling occurrence of a disease. These are the commands I've used:

# train a maxnet model with CV (there are too few presences to split the dataset)
model <- train("Maxnet", p = presence, a = bg_model, 
               fc = "lh", reg = 0.5, seed = 37, rep = 3)

# reduce number of vars in the model according to importance
reduced_var_model <- reduceVar(model, th = 9, metric = "auc", test = TRUE, 
                                     permut = 1, use_jk = TRUE)

# optimize model with genetic algorithm
h <- list(a = seq(4000, 9000, 500), reg = seq(0.2, 5, 0.2), fc = c("l", "lq", "lh", "lp", "lqp", "lqph"))
opt_model <- optimizeModel(reduced_var_model, 
                            hypers = h, 
                            metric = "auc", 
                            bg4test = bg, 
                            pop = 20, 
                            gen = 5, 
                            keep_best = 0.4, 
                            keep_random = 0.2, 
                            mutation_chance = 0.4, 
                            seed = 65651)

So far so good, until the optimization step, where I get:

Cross Validation [========================] 100% in 00:00:22
[...]
Cross Validation [========================] 100% in 00:00:12
Optimize Model [=========================>]  98% in 00:06:09
Error in .create_sdmtune_output(models, metric, train_metric, val_metric) : 
  no slot of name "model" for this object of class "SDMmodelCV"

However, the optimizeModel function is supposed to accept objects of type SDMmodelCV according to the manual page. What could be the problem?
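
The error message itself is what R raises when code reads a slot that a class does not define. The sketch below reproduces that with two hypothetical stand-in classes (`DemoModel` with a `model` slot, `DemoModelCV` with a `models` slot, mirroring the `model@models[[1]]` access used with SDMmodelCV objects). It says nothing about where in SDMtune the access happens; it only shows the mechanism and one way code can guard against it.

```r
library(methods)

# Hypothetical stand-ins: a single model keeps its fit in `model`,
# a cross-validated container keeps a list of fits in `models`
setClass("DemoModel", representation(model = "list"))
setClass("DemoModelCV", representation(models = "list"))

cv <- new("DemoModelCV", models = list(new("DemoModel", model = list())))

# Reading a slot the class does not define raises the same kind of error
msg <- tryCatch(cv@model, error = function(e) conditionMessage(e))
msg

# Checking for the slot first lets code branch on the object type
fits <- if (.hasSlot(cv, "model")) list(cv) else cv@models
length(fits)
```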

[BUG] modelReport() using RF method: type of predictors in new data do not match that of the training data.

Dear Sergio et al.,

I've been trying out SDMtune, and I really like the streamlined analysis approach, visual feedback, and the genetic algorithm for reducing the hyperparameter search space. Good job!

Today I experimented with different model methods, and all works fine so far with the Maxnet, Maxent, BRT and ANN methods. However, there is an issue with the RF method, see BUG report below.

The same error appears using my own data, after variable selection, hyperparameter tuning and model parsimony optimization. The error message suggests that predict.randomForest() cannot handle the passed argument newdata, but I couldn't figure out what happens.

Am I doing something wrong? Any help would be warmly appreciated.

Many thanks and best wishes from Zurich,
Simon

Describe the bug
modelReport() with the RF method cannot write the predicted distribution map using the default virtualSp dataset.

To Reproduce

library(SDMtune)

# Acquire environmental variables
files <- list.files(path = file.path(system.file(package = "dismo"), "ex"),
                    pattern = "grd", full.names = TRUE)
predictors <- raster::stack(files)

# Prepare presence and background locations
p_coords <- virtualSp$presence
bg_coords <- virtualSp$background

# Create SWD object
data <- prepareSWD(species = "Virtual species", p = p_coords, a = bg_coords,
                   env = predictors, categorical = "biome")

# Split presence locations in training (80%) and testing (20%) datasets
datasets <- trainValTest(data, test = 0.2, only_presence = TRUE)
train <- datasets[[1]]
test <- datasets[[2]]

# Train a model using the RF method
model <- train(method = "RF", data = train)

# Create the report
modelReport(model, type = "cloglog", folder = "testfolder", test = test,
            response_curves = FALSE, only_presence = TRUE, jk = TRUE,
            env = predictors, permut = 2)

── Model Report - method: RF ──────────────────────────────── Virtual species ──
✓ Saving files...
✓ Plotting ROC curve...
✓ Computing thresholds...
- Predicting distribution map...Quitting from lines 113-121 (modelReport.Rmd) 
Error in predict.randomForest(object@model, data, type = "prob") : 
  Type of predictors in new data do not match that of the training data.

Expected behavior
The modelReport() function is expected to run through using various model methods.

Add here the error message:

Error in predict.randomForest(object@model, data, type = "prob") : 
  Type of predictors in new data do not match that of the training data.

Additional Context

> model
Object of class SDMmodel 
Method: RF 

Species: Virtual species 
Presence locations: 320 
Absence locations: 5000 

Model configurations:
--------------------
mtry: 3
ntree: 500
nodesize: 1

Variables:
---------
Continuous: bio1 bio12 bio16 bio17 bio5 bio6 bio7 bio8 
Categorical: biome

> model@model@model

Call:
 randomForest(x = x, y = as.factor(p), ntree = ntree, mtry = mtry) 
               Type of random forest: classification
                     Number of trees: 500
No. of variables tried at each split: 3

        OOB estimate of  error rate: 9.3%
Confusion matrix:
     0   1 class.error
0 4825 175       0.035
1  320   0       1.000
> test
Object of class SWD 

Species: Virtual species 
Presence locations: 80 
Absence locations: 5000 

Variables:
---------
Continuous: bio1 bio12 bio16 bio17 bio5 bio6 bio7 bio8 
Categorical: biome 
> predictors
class      : RasterStack 
dimensions : 192, 186, 35712, 9  (nrow, ncol, ncell, nlayers)
resolution : 0.5, 0.5  (x, y)
extent     : -125, -32, -56, 40  (xmin, xmax, ymin, ymax)
crs        : +proj=longlat +datum=WGS84 +no_defs 
names      : bio1, bio12, bio16, bio17, bio5, bio6, bio7, bio8, biome 
min values :  -23,     0,     0,     0,   61, -212,   60,  -66,     1 
max values :  289,  7682,  2458,  1496,  422,  242,  461,  323,    14 
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] kableExtra_1.3.1 SDMtune_1.1.3   

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5          highr_0.8           plyr_1.8.6          pillar_1.4.7        compiler_4.0.2      plotROC_2.2.1      
 [7] tools_4.0.2         digest_0.6.27       viridisLite_0.3.0   evaluate_0.14       lifecycle_0.2.0     tibble_3.0.4       
[13] gtable_0.3.0        lattice_0.20-41     pkgconfig_2.0.3     rlang_0.4.9         cli_2.2.0           rstudioapi_0.13    
[19] yaml_2.2.1          rgdal_1.5-18        xfun_0.19           dismo_1.3-3         dplyr_1.0.2         httr_1.4.2         
[25] stringr_1.4.0       raster_3.4-5        knitr_1.30          xml2_1.3.2          generics_0.1.0      vctrs_0.3.5        
[31] webshot_0.5.2       grid_4.0.2          tidyselect_1.1.0    glue_1.4.2          R6_2.5.0            fansi_0.4.1        
[37] rmarkdown_2.5       sp_1.4-4            farver_2.0.3        ggplot2_3.3.2       purrr_0.3.4         magrittr_2.0.1     
[43] scales_1.1.1        codetools_0.2-18    ellipsis_0.3.1      htmltools_0.5.0     assertthat_0.2.1    randomForest_4.6-14
[49] rvest_0.3.6         colorspace_2.0-0    labeling_0.4.2      stringi_1.5.3       munsell_0.5.0       crayon_1.3.4       

Thanks for that great tool...

This R package's variable selection method unveiled and recommended important climatic variables for some endemic plant species with few presence points... As conservationists, that is what we are looking for in our modeling studies...
Thanks again...

[Enhancement] Accepting single raster as well

This is not actually a bug report, but I couldn't label it, so I am reporting it in the bug category.
I really prefer this package to biomod2, dismo, or sdm. I have been using it since 2020, and this new feature would make it even nicer.
Since I model virtual species, most of my research is conducted with a single variable or just a few. biomod2 accepts a single variable, but the prepareSWD function in SDMtune seems to accept only raster stacks.
I hope it can accept a single raster as well in the near future.
Thanks!

sincerely,
JEON

[Feature request]: modelReport for object class SDMmodelCV

Describe the new feature

I am using SDMtune to run a MaxEnt model with replicates. I want to generate a report similar to the original maxent output, which this package seems to be capable of via its modelReport function. However, it seems that this is not compatible with the type of object that is created by MaxEnt models with replicate runs (SDMmodelCV object). I would love if the function worked with a model with replicates as well.

Does the feature request already exist?

  • I have checked whether the same feature request already exists

[BUG] Real-time charts stop refreshing

Describe the bug
The real-time charts stop refreshing. This doesn't always happen, and it happens at different times during the execution of the optimizeModel function.

To Reproduce
Steps to reproduce the behavior:

  1. Run the optimizeModel function several times until it stops

Expected behavior
If the charts are not updated, you will see that the setInterval function stops sending the ajax request because it gets the final n value.

Screenshots

Additional context

Disable real-time graph???

Hello,
Is there any way to turn off the real-time graph globally or partially? I want to run different functions from your package in a parallel foreach loop. This works very well; however, each individual instance of the parallel foreach loop opens a separate browser window (Firefox) in which it displays and updates the corresponding real-time graph. I would like to stop this behavior. Is there a simple way?

Thanks.
best regards Frank

Response Curve error when creating model report

Over the past few days I've had an error pop up occasionally. Never seen it before, despite using the package over the past 4 or 5 months.

The model report function starts working, but then I get:
-- Model Report - method: Maxnet ------------------------------- HemiRisk2019 --
v Saving files...
v Plotting ROC curve...
v Computing thresholds...
v Plotting marginal response curves...

  • Plotting univariate response curves...Quitting from lines 84-99 (modelReport.Rmd)
    Error in sqrt(np) : non-numeric argument to mathematical function

I've noticed this pop up when response_curves=TRUE, obviously, and fc="lp" in the model. I just had it happen again twice. When I changed to fc="lph", for the same data, everything is fine. I'm pretty sure fc="lp" models have worked in the past though. I pretty much always use cloglog, if that might matter.

[Bug]: The auc() function reports an error when used alone

Describe the bug

The auc() function reports an error when used alone

Steps to reproduce the bug

library(SDMtune)
library(sp)
library(raster)
library(dismo)
library(rgdal)
library(rJava)
library(dplyr)
library(magrittr)
library(ade4)
library(ape)
library(gbm)
#library(ecospat)
library(sf)
library(doSNOW)
library(ENMeval)
library(rasterVis)
library(magrittr)
library(SDMtune)
library(zeallot)
library(ggplot2)   
library(maps)       
library(lattice)
library(plotROC)
library(kableExtra)
library(pROC)
library(terra)
library(ISwR)
library(PMCMRplus)


# Set a random seed in order to be able to reproduce this analysis.
set.seed(0)

occs <- read.table("D:/SXT/bamrepanda/panda_range/PANDA.csv", header=TRUE, sep=',')

occs <- occs[!duplicated(occs),]

#occs <- na.omit(occs)   

slopsin<-raster('D:/SXT/clipmap/else/right/slopesin1.tif')


elev=raster('D:/SXT/newdata/maps/bufferareaDEM/bufdem30mutm.tif')

landcover <- raster('D:/SXT/bamrepanda/ouyangNEE_model/land_cover_re30m.tif')

# bio factors

bio1<-raster('D:/SXT/DS-cor-maps/current/bio/bio1.tif')
# bio11<-resample(bio1,a)
# bio1<-bio11

bio5<-raster('D:/SXT/DS-cor-maps/current/bio/bio5.tif')
bio6<-raster('D:/SXT/DS-cor-maps/current/bio/bio6.tif')
bio12<-raster('D:/SXT/DS-cor-maps/current/bio/bio12.tif')

#bamboo<-raster('D:/SXT/bam_mod/mod1km_map/diversity/diversity_cur.tif')

envs_stack = stack(slopsin,elev,landcover,bio1,bio5,bio6,bio12)
names(envs_stack) <- c('slopsin','elev','landcover','bio1','bio5','bio6','bio12')
envs_stack$landcover <- raster::as.factor(envs_stack$landcover)


occs.cells <- raster::extract(envs_stack, occs, cellnumbers = TRUE)
occs.cellDups <- duplicated(occs.cells[,1])
occs <- occs[!occs.cellDups,]

bg <- dismo::randomPoints(envs_stack,10000) %>% as.data.frame()
colnames(bg) <- colnames(occs)

bg.cells <- raster::extract(envs_stack, bg, cellnumbers = TRUE)
bg.cellDups <- duplicated(bg.cells[,1])
bg <- bg[!bg.cellDups,]


data <- prepareSWD(species = "panda", p = occs, a =bg,
                   env = envs_stack, categorical = "landcover")

require(ENMeval)

## Checkerboard2 partition using the ENMeval package
cb_folds <- get.checkerboard2(occ = data@coords[data@pa == 1,], bg = data@coords[data@pa == 0,],
                              env =envs_stack, aggregation.factor = 4 )

model <- train(method = "Maxent", data = data, fc = c('lqpht'), reg = 1,
               folds = cb_folds)
saveRDS(model,"D:/SXT/bamrepanda/panda_bamboo/modelA/modelA.rda")



# Get variable importance

vi<-maxentVarImp(model)
vi

vi1<-varImp(model,permut = 10)
plotVarImp(vi1)

jkvi<-doJk(model,metric = 'auc')

# plotJk(jkvi,
#        type = 'train',
#        ref = auc(model))
#
# plotJk(jkvi,
#        type = 'test',
#        ref = auc(model))

plotResponse(model,
             var='bio12',
             type = 'logistic',
             only_presence = TRUE,
             marginal = FALSE,
             rug = TRUE)


#AUC
auc=auc(model)
auc
#ROC
#plotROC(model)

#TSS
tss=tss(model)
tss

Session information

> summary(modelA)
    Length      Class       Mode
         1 SDMmodelCV         S4
> modelA
Object of class SDMmodelCV
Method: Maxent

Species: panda
Replicates: 4
Presence locations: 822
Absence locations: 9999

Model configurations:
--------------------
fc: lqpht
reg: 1
iter: 500

Variables:
---------
Continuous: slopsin elev bio1 bio5 bio6 bio12
Categorical: landcover
> #AUC
> auc=auc(modelA)
Error in roc.default(response, predictor, auc = TRUE, ...) :
  No valid data provided.
> auc
function (...)
{
    UseMethod("auc")
}
<bytecode: 0x000002a6d27c2ef0>
<environment: namespace:pROC>
> #ROC
> #plotROC(model)
>
> #TSS
> tss=tss(modelA)
> tss
[1] 0.6508162
>

Additional information

No response

Reproducible example

  • I have done my best to provide the steps to reproduce the bug
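
In the session above, printing `auc` shows `<environment: namespace:pROC>`, which suggests the call reached pROC's `auc()` rather than SDMtune's: pROC is attached after SDMtune in the script, and the second `library(SDMtune)` call does not move an already attached package back to the front. The mechanism can be sketched in base R with two hypothetical "packages" modelled as lists attached to the search path. The names `pkg_first` and `pkg_second` are made up.

```r
# Two hypothetical "packages" that both export a function called auc
pkg_first  <- list(auc = function(x) "auc from pkg_first")
pkg_second <- list(auc = function(x) "auc from pkg_second")

attach(pkg_first,  name = "pkg_first",  warn.conflicts = FALSE)
# Attached last, so it sits earlier on the search path and is found first
attach(pkg_second, name = "pkg_second", warn.conflicts = FALSE)

# An unqualified call resolves to the most recently attached definition
masked <- auc(1)

# Asking for a specific source by name bypasses the masking
explicit <- get("auc", pos = "pkg_first", mode = "function")(1)

detach("pkg_second", character.only = TRUE)
detach("pkg_first", character.only = TRUE)

c(masked = masked, explicit = explicit)
```

For installed packages the equivalent of the explicit lookup is the `::` operator, so `SDMtune::auc(modelA)` would be the call to try here. Storing the result under a different name than `auc` also avoids confusion when reading the output.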

Customizing response curves

Describe the new feature

Hello!
I've been using SDMtune for some time now and I really enjoy using it. This is a great tool to run SDMs. So first of all, thank you for this great package!

This is more of a question than a feature request (sorry for putting it here, but this seemed more appropriate than the bug report).
I'm working on SDMs of two amphibian species and was thinking about plotting the responses of the two species within a single plotting pane (for each variable). But since the "plotResponse" function returns one plot for one species and one variable at a time, I was wondering if there's a way to "pull out" the data for response curves so that the response plots can be customized (e.g. in ggplot)?

Thank you in advance for the help!
Cheers,
Yucheol

Does the feature request already exist?

  • I have checked whether the same feature request already exists
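
One way to get plot-ready data for two species is to compute the curves directly: vary one variable over its range, hold the others at their mean, and predict. The sketch below does this in base R with simulated data and `glm()` as a stand-in for the fitted SDMs. Everything in it (the data, the variable names `temp` and `rain`, the helpers `simulate_fit()` and `response_data()`) is hypothetical and is not SDMtune's implementation.

```r
set.seed(1)

# Simulate presence/absence data for a made-up species whose occurrence
# peaks at a given temperature, then fit a stand-in model
simulate_fit <- function(optimum) {
  d <- data.frame(temp = runif(300, 0, 30), rain = runif(300, 0, 2000))
  d$pa <- rbinom(300, 1, plogis(2 - 0.02 * (d$temp - optimum)^2))
  list(data = d,
       model = glm(pa ~ poly(temp, 2) + rain, data = d, family = binomial))
}

# Response to `var`: vary it over its range, hold the other predictor at its mean
response_data <- function(fit, var, species, n = 50) {
  d <- fit$data
  grid <- data.frame(temp = rep(mean(d$temp), n), rain = rep(mean(d$rain), n))
  grid[[var]] <- seq(min(d[[var]]), max(d[[var]]), length.out = n)
  data.frame(species = species, x = grid[[var]],
             y = predict(fit$model, newdata = grid, type = "response"))
}

# One long table with both species, ready for a single plot
curves <- rbind(
  response_data(simulate_fit(10), "temp", "Species A"),
  response_data(simulate_fit(20), "temp", "Species B")
)
head(curves)
```

A table in this shape can be drawn in one pane with ggplot2 by mapping `species` to colour. Separately, if plotResponse returns a ggplot object, as its documentation is understood to state, the plotted values may also be available in the `data` element of the returned object.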

[Bug]: predict, maxent method - find_dims error

Describe the bug

When trying to use SDMtune::predict() to extract predictors over the entire extent of the environmental rasters, the following error is produced: Error in find_dims(object, model, nc, fun, const, na.rm, index, ...) :
could not find function "find_dims"

Calculating predictions for the pres/abs locations using the swd object as opposed to the predictors object works fine.

Steps to reproduce the bug

library(SDMtune)

map <- SDMtune::predict(model5,
               data = predictors,
               type = "cloglog")

Session information

R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.utf8  LC_CTYPE=English_United Kingdom.utf8    LC_MONETARY=English_United Kingdom.utf8
[4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] gridExtra_2.3        rJava_1.0-6          GGally_2.1.2         SDMtune_1.3.1        robis_2.11.3         MultiscaleDTM_0.5.3 
 [7] terra_1.7-3          raster_3.6-14        sp_1.6-0             lubridate_1.9.2      forcats_1.0.0        stringr_1.5.0       
[13] dplyr_1.1.2          purrr_1.0.1          readr_2.1.4          tidyr_1.3.0          tibble_3.2.1         ggplot2_3.4.2       
[19] tidyverse_2.0.0      sdmpredictors_0.2.15

loaded via a namespace (and not attached):
 [1] httr_1.4.7         maps_3.4.1         jsonlite_1.8.7     shiny_1.7.4        progress_1.2.2     pillar_1.9.0      
 [7] lattice_0.20-45    glue_1.6.2         digest_0.6.31      RColorBrewer_1.1-3 promises_1.2.0.1   colorspace_2.1-0  
[13] plyr_1.8.8         htmltools_0.5.4    httpuv_1.6.8       pkgconfig_2.0.3    xtable_1.8-4       scales_1.2.1      
[19] later_1.3.0        mapedit_0.6.0      tzdb_0.3.0         timechange_0.2.0   proxy_0.4-27       farver_2.1.1      
[25] generics_0.1.3     ellipsis_0.3.2     withr_2.5.1        cli_3.6.0          crayon_1.5.2       magrittr_2.0.3    
[31] mime_0.12          fansi_1.0.4        xml2_1.3.5         class_7.3-20       prettyunits_1.1.1  tools_4.2.2       
[37] dismo_1.3-9        data.table_1.14.6  hms_1.1.2          lifecycle_1.0.3    munsell_0.5.0      compiler_4.2.2    
[43] e1071_1.7-13       rlang_1.1.1        classInt_0.4-10    units_0.8-4        grid_4.2.2         rstudioapi_0.14   
[49] htmlwidgets_1.6.1  crosstalk_1.2.0    labeling_0.4.2     base64enc_0.1-3    gtable_0.3.3       codetools_0.2-18  
[55] reshape_0.8.9      DBI_1.1.3          curl_5.0.0         R6_2.5.1           rgdal_1.6-4        knitr_1.42        
[61] fastmap_1.1.0      utf8_1.2.2         KernSmooth_2.23-20 stringi_1.7.12     parallel_4.2.2     Rcpp_1.0.10       
[67] vctrs_0.6.2        sf_1.0-14          rgl_1.0.1          leaflet_2.1.1      tidyselect_1.2.0   xfun_0.36

Additional information

No response

Reproducible example

  • I have done my best to provide the steps to reproduce the bug

[Bug]: Please remove dependencies on **rgdal**, **rgeos**, and/or **maptools**

Describe the bug

This package depends on (depends, imports or suggests) raster and one or more of the retiring packages rgdal, rgeos or maptools (https://r-spatial.org/r/2022/04/12/evolution.html, https://r-spatial.org/r/2022/12/14/evolution2.html). Since raster 3.6.3, all use of external FOSS library functionality has been transferred to terra, making the retiring packages very likely redundant. It would help greatly if you could remove dependencies on the retiring packages as soon as possible.

Reproducible example

  • I have done my best to provide the steps to reproduce the bug

[Feature request]: convert SDMmodelCV to SDMmodel

Describe the new feature

It's very convenient that we can pass spatial folds to the model training and get an SDMmodelCV object, which a number of functions take as input. However, a number of other functions do not accept SDMmodelCV objects. It would be helpful to have a function that "collapses" an SDMmodelCV object to an SDMmodel object, which would mean lumping all of the occurrences and backgrounds together and retraining the model with the parameters found in the SDMmodelCV object. As far as I can tell, this does not currently exist. Just a suggestion.
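
The "collapse" idea can be sketched in base R as follows. This is only an illustration of the concept under stated assumptions: `glm()` stands in for the SDM algorithm, and the data, fold assignment and object names are hypothetical.

```r
set.seed(42)
# Hypothetical presence/background table with one predictor.
dat <- data.frame(pa = rbinom(120, 1, 0.4), bio1 = rnorm(120))
folds <- sample(rep(1:4, length.out = nrow(dat)))  # 4-fold assignment

# Cross-validated fits: one model per fold, trained on the other folds.
cv_models <- lapply(1:4, function(k) {
  glm(pa ~ bio1, family = binomial, data = dat[folds != k, ])
})

# "Collapsing": reuse the settings shared by the fold models (here the
# formula and family) and retrain once on all rows.
collapsed <- glm(formula(cv_models[[1]]),
                 family = family(cv_models[[1]]),
                 data = dat)

nobs(collapsed)
```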

Does the feature request already exist?

  • I have checked whether the same feature request already exists

[Feature request]: Add 10% training omission rate

Describe the new feature

Hi,

Is it possible to include the 10% training omission rate in the thresholds function?

Thank you
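
For reference, the 10% training omission threshold is commonly taken as the 10th percentile of the predicted values at the training presence locations. A minimal base-R sketch, with hypothetical prediction values, shows the calculation:

```r
# Hypothetical predicted suitability at the training presence locations.
set.seed(7)
pred_presence <- runif(100)

# The threshold is the value below which 10% of training presences fall,
# i.e. the 10th percentile of their predicted values.
p10 <- quantile(pred_presence, probs = 0.10, names = FALSE)

# Share of training presences predicted below the threshold (omitted).
omission <- mean(pred_presence < p10)
```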

Does the feature request already exist?

  • I have checked whether the same feature request already exists

[Bug]: Error in mm %*% object$betas : non-conformable arguments with gridSearch and hinge feature even with addSamplesToBg

Describe the bug

Dear Friends,

Every time I add the hinge feature class (alone) to the hyperparameters when tuning a Maxent model with gridSearch, I get the message Error in mm %*% object$betas : non-conformable arguments once RM > 4, even when addSamplesToBg is used beforehand.

I have the same problem with occurrence data sets from different species, all using the 19 WorldClim 2.1 bioclimatic variables together. The intent is to run a complete model for tuning and then reduce the variables.
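
The message itself comes from R's matrix product, which requires the number of columns on the left to match the length of the right-hand side. The sketch below only reproduces the message with hypothetical matrices; it does not diagnose the cause in the model above.

```r
mm <- matrix(1, nrow = 5, ncol = 3)   # 5 samples, 3 features
betas <- c(0.2, -0.1)                 # only 2 coefficients

# The product fails because 3 columns cannot be matched to 2 coefficients.
msg <- tryCatch(mm %*% betas, error = function(e) conditionMessage(e))
print(msg)

# Checking dimensions first makes the mismatch explicit.
conformable <- ncol(mm) == length(betas)
```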

Steps to reproduce the bug

library(SDMtune)

presence.points.coord <- presence.points.model[c("x", "y")]
background.points.coord <- background.points.model[c("x", "y")]

data.model.SWD <- prepareSWD(
  species = my.species,
  p = presence.points.coord,
  a = background.points.coord,
  env = predictors.model
)

data.model.SWD <- addSamplesToBg(data.model.SWD, all = TRUE)
folds <- randomFolds(
  data.model.SWD,
  k = 4,
  only_presence = FALSE,
  seed = 1968
)

maxent.model.base <- train(
  method = "Maxnet", 
  data = data.model.SWD, 
  folds = folds
)

tune.grid <- list(
  reg = seq(1, 5, 0.5),
  fc = c("l", "q", "h", "lq", "qh", "lqh", "lqph")
)

maxent.model.tune <- gridSearch(
  maxent.model.base,
  hypers = tune.grid,
  metric = "tss"
  # test = val.data.SWD
)

Session information

> sessionInfo()
R version 4.3.3 (2024-02-29)
Platform: x86_64-apple-darwin20 (64-bit)
Running under: macOS Big Sur 11.7.10

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Boa_Vista
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] MASS_7.3-60.0.1          grinnell_0.0.22          rnaturalearthhires_1.0.0
 [4] flexsdm_1.3.4            htmlwidgets_1.6.4        datazoom.amazonia_1.1.0 
 [7] bdc_1.1.4                SDMtune_1.3.1            raster_3.6-26           
[10] sp_2.1-3                 flextable_0.9.5          gtsummary_1.7.2         
[13] officer_0.6.5            tidyr_1.3.1              data.table_1.15.2       
[16] kableExtra_1.4.0         formattable_0.2.1        gt_0.10.1               
[19] ggpubr_0.6.0             countrycode_1.6.0        geobr_1.8.2             
[22] scales_1.3.0             viridis_0.6.5            viridisLite_0.4.2       
[25] ENMeval_2.0.4            magrittr_2.0.3           pROC_1.18.5             
[28] usdm_2.1-7               RColorBrewer_1.1-3       corrplot_0.92           
[31] dplyr_1.1.4              geodata_0.5-9            terra_1.7-71            
[34] CoordinateCleaner_3.0.1  maps_3.4.2               tmap_3.3-4              
[37] sf_1.0-16                ggplot2_3.5.0            pacman_0.5.1            

loaded via a namespace (and not attached):
  [1] splines_4.3.3           later_1.3.2             rnaturalearth_1.0.1    
  [4] fields_15.2             tibble_3.2.1            cellranger_1.1.0       
  [7] XML_3.99-0.16.1         lifecycle_1.0.4         rstatix_0.7.2          
 [10] rprojroot_2.0.4         doParallel_1.0.17       processx_3.8.4         
 [13] lattice_0.22-6          crosstalk_1.2.1         exactextractr_0.10.0   
 [16] backports_1.4.1         rmarkdown_2.26          remotes_2.5.0          
 [19] httpuv_1.6.15           spam_2.10-0             zip_2.3.1              
 [22] askpass_1.2.0           pkgbuild_1.4.4          DBI_1.2.2              
 [25] abind_1.4-5             purrr_1.0.2             nnet_7.3-19            
 [28] openxlsx2_1.5           gdtools_0.3.7           gbm_2.1.9              
 [31] crul_1.4.0              ellipse_0.5.0           units_0.8-5            
 [34] svglite_2.1.3           codetools_0.2-19        dismo_1.3-14           
 [37] RApiSerialize_0.1.2     DT_0.32                 xml2_1.3.6             
 [40] shape_1.4.6.1           tidyselect_1.2.1        farver_2.1.1           
 [43] httpcode_0.3.0          base64enc_0.1-3         broom.helpers_1.14.0   
 [46] jsonlite_1.8.8          rgnparser_0.3.0         e1071_1.7-14           
 [49] survival_3.5-8          iterators_1.0.14        systemfonts_1.0.6      
 [52] foreach_1.5.2           tools_4.3.3             ragg_1.3.0             
 [55] Rcpp_1.0.12             glue_1.7.0              contentid_0.0.18       
 [58] gridExtra_2.3           here_1.0.1              xfun_0.43              
 [61] mgcv_1.9-1              withr_3.0.0             fastmap_1.1.1          
 [64] fansi_1.0.6             openssl_2.1.1           callr_3.7.6            
 [67] digest_0.6.35           R6_2.5.1                mime_0.12              
 [70] wk_0.9.1                textshaping_0.3.7       colorspace_2.1-0       
 [73] maxnet_0.1.4            dichromat_2.0-0.1       utf8_1.2.4             
 [76] generics_0.1.3          fontLiberation_0.1.0    class_7.3-22           
 [79] httr_1.4.7              tmaptools_3.1-1         whisker_0.4.1          
 [82] pkgconfig_2.0.3         gtable_0.3.4            sys_3.4.2              
 [85] htmltools_0.5.8         fontBitstreamVera_0.1.1 carData_3.0-5          
 [88] dotCall64_1.1-1         spThin_0.2.0            png_0.1-8              
 [91] knitr_1.45              rstudioapi_0.16.0       tzdb_0.4.0             
 [94] geosphere_1.5-18        rgbif_3.7.9             uuid_1.2-0             
 [97] nlme_3.1-164            curl_5.2.1              cachem_1.0.8           
[100] proxy_0.4-27            stringr_1.5.1           KernSmooth_2.23-22     
[103] parallel_4.3.3          desc_1.4.3              s2_1.1.6               
[106] leafsync_0.1.0          pillar_1.9.0            grid_4.3.3             
[109] vctrs_0.6.5             promises_1.2.1          randomForest_4.7-1.1   
[112] stringfish_0.16.0       car_3.1-2               dbplyr_2.5.0           
[115] xtable_1.8-4            evaluate_0.23           oai_0.4.0              
[118] readr_2.1.5             cli_3.6.2               compiler_4.3.3         
[121] rlang_1.1.3             crayon_1.5.2            ggsignif_0.6.4         
[124] labeling_0.4.3          classInt_0.4-10         ps_1.7.6               
[127] plyr_1.8.9              fs_1.6.3                stringi_1.8.3          
[130] stars_0.6-4             taxadb_0.2.1            munsell_0.5.0          
[133] lazyeval_0.2.2          leaflet_2.2.2           glmnet_4.1-8           
[136] fontquiver_0.2.1        Matrix_1.6-5            hms_1.1.3              
[139] patchwork_1.2.0         leafem_0.2.3            gfonts_0.2.0           
[142] shiny_1.8.1             qs_0.26.1               kernlab_0.9-32         
[145] memoise_2.0.1           broom_1.0.5             RcppParallel_5.1.7     
[148] lwgeom_0.2-14           readxl_1.4.3            ape_5.7-1

Additional information

Bactrocera dorsalis Filtered Clean Occ 2024-03-27.xlsx

Occurrences used to model.

Reproducible example

  • I have done my best to provide the steps to reproduce the bug

[BUG] varSel not working after model train

Hi, I'm trying to follow the package's workflow, but the same error always occurs in the varSel step.

set.seed(0)
bg_full <- randomPoints(MEX_pred, n = 10000, onca_full, ext= MEX_pred, extf=1, excludep=TRUE, prob=FALSE,
cellnumbers=FALSE, tryf=3, warn=2, lonlatCorrection=TRUE)

onca_full <- data.frame(onca_full)
bg_full<-data.frame(bg_full)

data_full<- prepareSWD(species = "Panthera_onca", p = onca_full , a = bg_full,
env = MEX_pred)

#Spatialblocks
sac <- spatialAutoRange(rasterLayer = MEX_pred,
sampleNumber = 5000,
doParallel = TRUE,
showPlots = TRUE)

e_folds_full <- spatialBlock(speciesData = sp_df_full,
rasterLayer = MEX_pred,
theRange = 403023,
selection = "random",
iteration = 100, # find evenly dispersed folds
biomod2Format = FALSE,
xOffset = 0, # shift the blocks horizontally
yOffset = 0)

#Model
cv_model_full <- train(method = "Maxent", data = data_full, folds = e_folds_full, iter = 700)

#varSel

bg_var_sel <- prepareSWD(species = "Panthera onca", a = bg_full, env = MEX_pred)

plotCor(bg_var_sel, method = "spearman", cor_th = 0.7)

corVar(bg_var_sel, method = "spearman", cor_th = 0.7)

vs <- varSel(cv_model_full, metric = "auc", test = NULL,
bg4cor = bg_var_sel, method = "spearman",
cor_th = 0.7, permut = 10)

After some time I get this error:

Error in `[<-.data.frame`(`*tmp*`, variables[i], value = NULL) :
missing values are not allowed in subscripted assignments of data frames
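
This message is raised by base R when a data frame column is assigned using an NA subscript. A minimal sketch with hypothetical data, unrelated to the internals of varSel, shows one way `variables[i]` can become NA:

```r
df <- data.frame(bio1 = 1:3, bio2 = 4:6)
variables <- c("bio1", "bio2")

# Indexing past the end of a character vector yields NA ...
i <- 3
variables[i]

# ... and using that NA as a column name in an assignment fails.
msg <- tryCatch({
  df[variables[i]] <- NULL
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Guarding against NA before the assignment avoids the error.
if (!is.na(variables[i])) df[variables[i]] <- NULL
```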


[Bug]: doJk freezes when a continuous variable has many zeros

Describe the bug

Hello Sergio,

I've been using SDMtune for a while (thanks for that!) and I'm using all possible models to ensemble them after. I'm doing a jackknife test for each model but it seems SDMtune::doJk freezes when one of the continuous variables has many zeros and the model is random forest.

Please see the snippet below, where SDMtune::doJk freezes at 6%.

Thanks,
Arnan

Steps to reproduce the bug

library(SDMtune)
# Acquire environmental variables
files <- list.files(path = file.path(system.file(package = "dismo"), "ex"),
                    pattern = "grd", full.names = TRUE)
predictors <- raster::stack(files)

#Modify one raster to have many zeros (still a continuous variable)
ras_zero <- predictors[[1]]
ras_zero
ras_zero[ras_zero < 285] <- 0
predictors[[1]] <- ras_zero
plot(predictors[[1]])

#Presence-absence data
p_coords <- virtualSp$presence
a_coords <- virtualSp$absence

# Create SWD object
data <- prepareSWD(species = "Virtual species", p = p_coords, a = a_coords,
                   env = predictors)

# Cross-validation and jackknife test
folds <- randomFolds(data, k = 10, only_presence = FALSE, seed = 25)
model <- SDMtune::train("RF", data, folds = folds, ntree = 500)
doJk(model, "auc") # stops at 6%

Session information

R version 4.2.1 (2022-06-23 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

Random number generation:
 RNG:     Mersenne-Twister 
 Normal:  Inversion 
 Sample:  Rounding 
 
locale:
[1] LC_COLLATE=English_United Kingdom.utf8  LC_CTYPE=English_United Kingdom.utf8   
[3] LC_MONETARY=English_United Kingdom.utf8 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] SDMtune_1.1.6 sp_1.5-0     

loaded via a namespace (and not attached):
 [1] progress_1.2.2       tidyselect_1.1.2     terra_1.6-17        
 [4] purrr_0.3.4          sf_1.0-8             lattice_0.20-45     
 [7] colorspace_2.0-3     vctrs_0.4.1          generics_0.1.3      
[10] utf8_1.2.2           rlang_1.0.6          e1071_1.7-11        
[13] pillar_1.8.1         glue_1.6.2           DBI_1.1.3           
[16] foreach_1.5.2        lifecycle_1.0.3      stringr_1.4.1       
[19] munsell_0.5.0        gtable_0.3.1         raster_3.6-3        
[22] codetools_0.2-18     class_7.3-20         fansi_1.0.3         
[25] Rcpp_1.0.9           KernSmooth_2.23-20   scales_1.2.1        
[28] classInt_0.4-7       ggplot2_3.3.6        hms_1.1.2           
[31] stringi_1.7.8        dplyr_1.0.10         dismo_1.3-9         
[34] grid_4.2.1           rgdal_1.5-32         cli_3.4.1           
[37] tools_4.2.1          magrittr_2.0.3       proxy_0.4-27        
[40] tibble_3.1.8         randomForest_4.7-1.1 pacman_0.5.1        
[43] crayon_1.5.1         pkgconfig_2.0.3      ellipsis_0.3.2      
[46] prettyunits_1.1.1    assertthat_0.2.1     rstudioapi_0.14     
[49] iterators_1.0.14     R6_2.5.1             units_0.8-0         
[52] compiler_4.2.1

Additional information

No response

Reproducible example

  • I have done my best to provide the steps to reproduce the bug

Error in predict when working with Random Forest and specifying a categorical variable as a predictor

I get an error when working with a Random Forest model and specifying a categorical variable as a predictor.

library(SDMtune)
files <- list.files(path = file.path(system.file(package = "dismo"), "ex"), pattern = "grd", full.names = TRUE)
predictors <- raster::stack(files)
help(virtualSp)
p_coords <- virtualSp$presence
bg_coords <- virtualSp$background
data <- prepareSWD(species = "Virtual species", p = p_coords, a = bg_coords, env = predictors, categorical = "biome")
default_model <- train(method = "RF", data = data)
map <- predict(default_model, data = predictors)

Error in predict.randomForest(object@model, data, type = "prob") : Type of predictors in new data do not match that of the training data.

[Bug]: varImp and maxentVarImp plots are not comparable

Describe the bug

varImp returns a data.frame with two columns while maxentVarImp returns a data.frame with three columns.
plotVarImp is simply plotting the first two columns, which are different. Plots are thus not comparable.

Steps to reproduce the bug

library(SDMtune)
## Calculate variable importance
vi1 <- varImp(maxent_model, permut = 10, progress = TRUE)
vi2 <- maxentVarImp(maxent_model)

## Plot variable importance
plotVarImp(vi1) # Plots Variable and Permutation_importance
plotVarImp(vi2) # Plots Variable and Percent_contribution
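
One possible workaround is to select the same measure by name before plotting, so that both tables share a layout. The sketch uses hypothetical values; the column names are assumed from the comments in the snippet above, including a third column in `vi2` named `Permutation_importance`.

```r
# Hypothetical outputs mimicking the column layouts described above.
vi1 <- data.frame(Variable = c("bio1", "bio5"),
                  Permutation_importance = c(60, 40))
vi2 <- data.frame(Variable = c("bio1", "bio5"),
                  Percent_contribution = c(70, 30),
                  Permutation_importance = c(55, 45))

# Select the same measure by name so both tables have an identical layout.
vi2_perm <- vi2[, c("Variable", "Permutation_importance")]

identical(names(vi1), names(vi2_perm))
```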

Session information

R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] blockCV_3.1-3       zeallot_0.1.0       plotROC_2.3.0       rJava_1.0-6         rasterVis_0.51.5   
 [6] lattice_0.20-45     predicts_0.1-8      stars_0.6-3         abind_1.4-5         mapview_2.11.0.9006
[11] geodata_0.5-8       sf_1.0-14           terra_1.7-39        rgbif_3.7.7         usethis_2.2.2      
[16] janitor_2.2.0       lubridate_1.9.2     forcats_1.0.0       stringr_1.5.0       dplyr_1.1.2        
[21] purrr_1.0.1         readr_2.1.4         tidyr_1.3.0         tibble_3.2.1        ggplot2_3.4.2      
[26] tidyverse_2.0.0     SDMtune_1.3.1      

loaded via a namespace (and not attached):
 [1] fs_1.6.2            satellite_1.0.4     oai_0.4.0           RColorBrewer_1.1-3  httr_1.4.6         
 [6] tools_4.2.0         utf8_1.2.3          R6_2.5.1            KernSmooth_2.23-20  DBI_1.1.3          
[11] lazyeval_0.2.2      colorspace_2.1-0    raster_3.6-23       withr_2.5.0         sp_2.0-0           
[16] tidyselect_1.2.0    leaflet_2.1.2       compiler_4.2.0      leafem_0.2.0        cli_3.6.1          
[21] xml2_1.3.5          labeling_0.4.2      scales_1.2.1        classInt_0.4-9      hexbin_1.28.3      
[26] proxy_0.4-27        digest_0.6.33       dismo_1.3-14        base64enc_0.1-3     jpeg_0.1-10        
[31] pkgconfig_2.0.3     htmltools_0.5.6     fastmap_1.1.1       htmlwidgets_1.6.2   rlang_1.1.1        
[36] rstudioapi_0.15.0   farver_2.1.1        generics_0.1.3      zoo_1.8-12          jsonlite_1.8.7     
[41] crosstalk_1.2.0     magrittr_2.0.3      interp_1.1-4        Rcpp_1.0.11         munsell_0.5.0      
[46] fansi_1.0.4         lifecycle_1.0.3     stringi_1.7.12      whisker_0.4.1       snakecase_0.11.0   
[51] plyr_1.8.8          grid_4.2.0          parallel_4.2.0      deldir_1.0-9        hms_1.1.3          
[56] pillar_1.9.0        codetools_0.2-18    stats4_4.2.0        glue_1.6.2          latticeExtra_0.6-30
[61] data.table_1.14.8   png_0.1-8           vctrs_0.6.2         tzdb_0.4.0          gtable_0.3.3       
[66] lwgeom_0.2-13       e1071_1.7-13        class_7.3-20        viridisLite_0.4.2   sfheaders_0.4.3    
[71] units_0.8-3         timechange_0.2.0

Additional information

No response

Reproducible example

  • I have done my best to provide the steps to reproduce the bug

Error when plotting response for categorical predictors -RF-class [BUG]

Describe the bug
I'm getting an error message when I use the plotResponse() function to plot the marginal effect of a categorical predictor for a SDMmodelCV object (RF-class). I do not receive the same error message for models of class ANN or BRT trained on the same CV folds.

Expected behavior
I expect the function to produce a bar chart with the predicted probability of occurrence for each level of the categorical predictor +/- 1 SD. Instead I receive the error message:

Error in predict.randomForest(object@model, data, type = "prob") :
Type of predictors in new data do not match that of the training data.

Additional context
I gather this is an error message generated by the randomForest package when the factor levels in the new testing data do not match those found in the data used to train the model. I am using v. 4.6-14 of randomForest.

I think the error stems from the definition of the category levels in the plotResponse() function (L95 of the source code):

categ <- unique(as.numeric(levels(df[, var]))[df[, var]])

Converting the unique levels to class numeric results in a mismatch between the SWD_obj@data in which the categorical variable is class factor and the new object 'data' generated by the .get_plot_data() function (L170-182) that will be used to plot the predicted response. Specifically, this will cause the error described above when L191 is executed:

pred <- predict(model, data, type = type)
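
The type change can be reproduced in base R. The sketch below uses a hypothetical factor and is not the package's actual fix; it shows that the quoted conversion yields plain numbers and that rebuilding a factor with the training levels restores the original type.

```r
# Hypothetical categorical predictor stored as a factor, as in an SWD object.
biome <- factor(c("1", "4", "4", "7", "1"))

# The conversion quoted above returns plain numbers, not a factor.
categ <- unique(as.numeric(levels(biome))[biome])
class(categ)

# Rebuilding a factor with the training levels restores the original type.
categ_factor <- factor(categ, levels = levels(biome))
class(categ_factor)
```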

[Feature request]: have plotResponse invisibly return the data

Describe the new feature

Currently, plotResponse() will generate a ggplot-based plot. But if I would like to generate my own figures, it would be helpful if the function invisibly returned the data used to generate the plot (and by invisibly, I mean that if you assigned to a variable, the data would be passed to that variable, but it would not print the data to the console if not assigned. See base::invisible()).
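
The requested behaviour can be sketched with a hypothetical helper, `plot_curve()`, that draws with base graphics and returns its data through `invisible()`. It illustrates the pattern only and is not plotResponse() itself.

```r
# Hypothetical plotting helper: draws a curve and invisibly returns its data.
plot_curve <- function(x, y) {
  plot(x, y, type = "l")
  invisible(data.frame(x = x, y = y))
}

grDevices::pdf(NULL)                     # null device, so nothing is written
plot_curve(1:5, (1:5)^2)                 # no data printed to the console
curve_data <- plot_curve(1:5, (1:5)^2)   # data captured on assignment
invisible(grDevices::dev.off())
```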

Does the feature request already exist?

  • I have checked whether the same feature request already exists

How to parse the result from `reduceVar()` to get retained var names?

This is not really a bug, I believe, so I removed the [BUG] part of the issue title.

The context:

I need to run MaxEnt and RF fully automated. For that I'd like to get the names of the retained variables after using reduceVar() so I can match those names with names of raster layers in a raster stack to finally make predictions. Is that possible?

This is how far I have reached:

> reduc_var_model_sp@models[[1]]@data

Object of class SWD 

Species: Aedes aegypti 
Presence locations: 2624 
Absence locations: 9350 

Variables:
---------
Continuous: chelsa_bio01 chelsa_bio04 chelsa_bio05 chelsa_bio17 
Categorical: NA 

How can I get the names of Variables from there, both Continuous and Categorical (if there were any)?
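
One approach, assuming the SWD object keeps its predictors in a data.frame slot named `data` with categorical variables stored as factors, is to read the column names from that slot. The sketch uses a mock S4 class (`MockSWD`, hypothetical) in place of the real object:

```r
library(methods)

# Mock class with a data.frame slot named `data`, standing in for an SWD
# object (an assumption about the real class layout).
setClass("MockSWD", representation(species = "character", data = "data.frame"))

swd <- new("MockSWD",
           species = "Aedes aegypti",
           data = data.frame(chelsa_bio01 = c(1.2, 3.4),
                             chelsa_bio04 = c(5.6, 7.8),
                             landcover = factor(c("a", "b"))))

# Factor columns are treated as categorical, the rest as continuous.
is_cat <- vapply(swd@data, is.factor, logical(1))
continuous <- names(swd@data)[!is_cat]
categorical <- names(swd@data)[is_cat]
```

If the real object follows this layout, the equivalent call for the model above would be `names(reduc_var_model_sp@models[[1]]@data@data)`.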

Apologies if this is a silly question and thanks much in advance for any hints 😸

[Feature request]:

Describe the new feature

I would like to know whether it is possible to project the error associated with the predicted relative probability of species occurrence to the extent of the environmental variables. I've not come across any information in the SDMtune literature or in posts on the SDMtune GitHub mentioning this feature, but I would be very interested to know if this is in development.

Kind regards
Penny
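
One common way to express such an error, not confirmed here as an SDMtune feature, is the per-cell standard deviation of predictions across models trained on different folds or replicates. A minimal base-R sketch with a hypothetical prediction matrix:

```r
# Hypothetical predictions: rows are raster cells, columns are models
# trained on different folds or replicates.
set.seed(3)
preds <- matrix(runif(6 * 4), nrow = 6, ncol = 4,
                dimnames = list(NULL, paste0("fold", 1:4)))

# Per-cell mean prediction and its spread across the models.
cell_mean <- rowMeans(preds)
cell_sd <- apply(preds, 1, sd)

uncertainty <- data.frame(cell = seq_len(nrow(preds)),
                          mean = cell_mean, sd = cell_sd)
```

Writing the `sd` column back to the cells of a raster would give a map of prediction spread alongside the mean prediction.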

Does the feature request already exist?

  • I have checked whether the same feature request already exists

[BUG] cluster error during predict with large raster stack

Describe the bug
SDMtune::predict fails when parallel = TRUE. The error message is:
in raster::clusterR(data, predict, args = list(model = model, clamp = clamp, :
cluster error

The raster stack being used is large: 14 layers of 9 GB each. It doesn't appear to be a memory issue; Windows Task Manager shows plenty of memory left.

If I set parallel = FALSE, a new error occurs:
in p[-naind, ] <- predv : "number of items to replace is not a multiple of replacement length".

Reducing number of layers to less than 9 seems to fix it.

modelReport fails if response_curves = TRUE or jk=TRUE

If I set response_curves = TRUE or jk=TRUE, modelReport fails. I get:

plotting univariate response curves...Quitting from lines 84-99 (modelReport.Rmd)
Error in apply(pdata, 1, function(rr) !any(apply(ndata, 1, function(r) identical(r, :
dim(X) must have a positive length

OR

  • Running Jackknife test...Quitting from lines 142-156 (modelReport.Rmd)
    Error in apply(pdata, 1, function(rr) !any(apply(ndata, 1, function(r) identical(r, :
    dim(X) must have a positive length

It seems that something is wrong with data that is being passed to maxnet.

This happens with my data, but it also happens if I try to use the package's example. I didn't previously have this problem. I've recently updated SDMtune and several related packages (maxnet, dismo), though.
