Dear Sergio et al.,
I've been trying out SDMtune, and I really like the streamlined analysis approach, visual feedback, and the genetic algorithm for reducing the hyperparameter search space. Good job!
Today I experimented with different model methods, and everything works fine so far with the Maxnet, Maxent, BRT and ANN methods. However, there is an issue with the RF method; see the bug report below.
The same error appears using my own data, after variable selection, hyperparameter tuning and model parsimony optimization. The error message suggests that predict.randomForest() cannot handle the passed argument newdata, but I couldn't figure out what is going wrong.
Am I doing something wrong? Any help would be warmly appreciated.
Many thanks and best wishes from Zurich,
Simon
Describe the bug
modelReport() with the RF method cannot write the predicted distribution map using the default virtualSp dataset.
To Reproduce
library(SDMtune)
# Acquire environmental variables
files <- list.files(path = file.path(system.file(package = "dismo"), "ex"),
                    pattern = "grd", full.names = TRUE)
predictors <- raster::stack(files)
# Prepare presence and background locations
p_coords <- virtualSp$presence
bg_coords <- virtualSp$background
# Create SWD object
data <- prepareSWD(species = "Virtual species", p = p_coords, a = bg_coords,
                   env = predictors, categorical = "biome")
# Split presence locations in training (80%) and testing (20%) datasets
datasets <- trainValTest(data, test = 0.2, only_presence = TRUE)
train <- datasets[[1]]
test <- datasets[[2]]
# Train a model using the RF method
model <- train(method = "RF", data = train)
# Create the report
modelReport(model, type = "cloglog", folder = "testfolder", test = test,
            response_curves = FALSE, only_presence = TRUE, jk = TRUE,
            env = predictors, permut = 2)
── Model Report - method: RF ──────────────────────────────── Virtual species ──
✔ Saving files...
✔ Plotting ROC curve...
✔ Computing thresholds...
- Predicting distribution map...Quitting from lines 113-121 (modelReport.Rmd)
Error in predict.randomForest(object@model, data, type = "prob") :
Type of predictors in new data do not match that of the training data.
Expected behavior
The modelReport() function is expected to run to completion with every supported model method, including RF.
Error message:
Error in predict.randomForest(object@model, data, type = "prob") :
Type of predictors in new data do not match that of the training data.
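For context, the same message can be triggered with randomForest alone. The sketch below is only a guess at the mechanism, not a confirmed diagnosis of what happens inside SDMtune: it assumes that a factor predictor in the new data carries fewer levels than it did during training. The toy data and the names train_df, new_df and rf are made up for illustration.

```r
library(randomForest)

set.seed(1)

# Toy training data: one numeric predictor and one factor with three levels
# (hypothetical stand-in for a categorical layer such as biome)
train_df <- data.frame(
  x1 = rnorm(60),
  biome = factor(rep(c("1", "2", "3"), each = 20))
)
y <- factor(rep(c(0, 1), times = 30))

rf <- randomForest(x = train_df, y = y, ntree = 50)

# New data whose factor carries only two of the three training levels
new_df <- data.frame(
  x1 = rnorm(4),
  biome = factor(c("1", "2", "1", "2"))
)

# Capture the error message instead of stopping the script
msg <- tryCatch(
  {
    predict(rf, new_df, type = "prob")
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Re-declaring the factor with the training levels lets the prediction run
new_df$biome <- factor(new_df$biome, levels = levels(train_df$biome))
prob <- predict(rf, new_df, type = "prob")
print(prob)
```

Whether modelReport() reaches this situation through the biome layer when predicting on the raster stack is an assumption that would still need to be checked.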
Additional Context
> model
Object of class SDMmodel
Method: RF
Species: Virtual species
Presence locations: 320
Absence locations: 5000
Model configurations:
--------------------
mtry: 3
ntree: 500
nodesize: 1
Variables:
---------
Continuous: bio1 bio12 bio16 bio17 bio5 bio6 bio7 bio8
Categorical: biome
> model@model@model
Call:
randomForest(x = x, y = as.factor(p), ntree = ntree, mtry = mtry)
Type of random forest: classification
Number of trees: 500
No. of variables tried at each split: 3
OOB estimate of error rate: 9.3%
Confusion matrix:
     0   1 class.error
0 4825 175       0.035
1  320   0       1.000
> test
Object of class SWD
Species: Virtual species
Presence locations: 80
Absence locations: 5000
Variables:
---------
Continuous: bio1 bio12 bio16 bio17 bio5 bio6 bio7 bio8
Categorical: biome
> predictors
class : RasterStack
dimensions : 192, 186, 35712, 9 (nrow, ncol, ncell, nlayers)
resolution : 0.5, 0.5 (x, y)
extent : -125, -32, -56, 40 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=WGS84 +no_defs
names : bio1, bio12, bio16, bio17, bio5, bio6, bio7, bio8, biome
min values : -23, 0, 0, 0, 61, -212, 60, -66, 1
max values : 289, 7682, 2458, 1496, 422, 242, 461, 323, 14
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] kableExtra_1.3.1 SDMtune_1.1.3
loaded via a namespace (and not attached):
[1] Rcpp_1.0.5 highr_0.8 plyr_1.8.6 pillar_1.4.7 compiler_4.0.2 plotROC_2.2.1
[7] tools_4.0.2 digest_0.6.27 viridisLite_0.3.0 evaluate_0.14 lifecycle_0.2.0 tibble_3.0.4
[13] gtable_0.3.0 lattice_0.20-41 pkgconfig_2.0.3 rlang_0.4.9 cli_2.2.0 rstudioapi_0.13
[19] yaml_2.2.1 rgdal_1.5-18 xfun_0.19 dismo_1.3-3 dplyr_1.0.2 httr_1.4.2
[25] stringr_1.4.0 raster_3.4-5 knitr_1.30 xml2_1.3.2 generics_0.1.0 vctrs_0.3.5
[31] webshot_0.5.2 grid_4.0.2 tidyselect_1.1.0 glue_1.4.2 R6_2.5.0 fansi_0.4.1
[37] rmarkdown_2.5 sp_1.4-4 farver_2.0.3 ggplot2_3.3.2 purrr_0.3.4 magrittr_2.0.1
[43] scales_1.1.1 codetools_0.2-18 ellipsis_0.3.1 htmltools_0.5.0 assertthat_0.2.1 randomForest_4.6-14
[49] rvest_0.3.6 colorspace_2.0-0 labeling_0.4.2 stringi_1.5.3 munsell_0.5.0 crayon_1.3.4