Comments (7)
Could you also share your {blockCV} code and demonstrate the different outcomes using a common seed?
from mlr3spatiotempcv.
Thanks for your reply. Here's an expanded example that also uses {blockCV}.
library(mlr3)
library(mlr3spatiotempcv)
set.seed(123)
x <- runif(5000, -80.5, -75)
y <- runif(5000, 39.7, 42)
data <- data.frame(spp="test",
label=factor(round(runif(length(x), 0, 1))),
x=x,
y=y)
testTask <- TaskClassifST$new(id = "test",
backend = data,
target = "label",
positive="1",
extra_args = list(coordinate_names=c("x", "y"),
crs="EPSG: 4326"))
blockSamp <- rsmp("spcv_block",
folds=2,
range=50000,
selection="checkerboard")
blockSamp$instantiate(testTask)
plot(blockSamp, testTask)
#> CRS not set, transforming to WGS84 (EPSG: 4326).
library(blockCV)
library(sf)
#> Linking to GEOS 3.8.1, GDAL 3.1.4, PROJ 6.3.1
testSF <- st_as_sf(data[,c("x", "y", "spp")],
coords = c("x", "y"),
crs="EPSG: 4326")
testBlock <- spatialBlock(speciesData = testSF,
species="spp",
theRange=50000,
k=1,
selection="checkerboard")
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
#> train_test test_test
#> 1 2504 2496
#> 2 2496 2504
#> Warning in st_point_on_surface.sfc(sf::st_zm(x)): st_point_on_surface may not
#> give correct results for longitude/latitude data
# axis labels belong in plot(); points() does not accept ylab
plot(data$x[testBlock$folds[[1]][[1]]], data$y[testBlock$folds[[1]][[1]]],
col="red", pch=20, xlab="X_coord", ylab="Y_coord")
points(data$x[testBlock$folds[[1]][[2]]], data$y[testBlock$folds[[1]][[2]]],
col="orange", pch=20)
Created on 2021-03-09 by the reprex package (v1.0.0)
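For readers unfamiliar with the selection="checkerboard" option used above, the idea can be sketched in base R: lay a regular grid of square blocks over the points and alternate the fold label between neighbouring blocks. This is only an illustration of the concept, not the {blockCV} implementation. The helper name checkerboard_folds, the grid origin at the minimum coordinates, and the use of projected coordinates in metres are all assumptions made for this sketch.

```r
# Hypothetical helper for illustration; not part of blockCV or mlr3spatiotempcv.
# Assumes x and y are projected coordinates in the same unit as block_size.
checkerboard_folds <- function(x, y, block_size) {
  # index of the grid column and row each point falls into
  col <- floor((x - min(x)) / block_size)
  row <- floor((y - min(y)) / block_size)
  # neighbouring blocks get alternating labels 1 and 2
  as.integer((col + row) %% 2) + 1L
}

set.seed(123)
x <- runif(5000, 1270260, 1778400)
y <- runif(5000, 1967070, 2292000)

folds <- checkerboard_folds(x, y, block_size = 50000)
table(folds)
```

With only two labels available, a checkerboard layout always yields two folds, whatever number of folds is requested.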
Thanks! This is a bug; I could reproduce it.
I'll fix it in the next few days.
Thanks for fixing this issue. I think I may have run into a new problem when using projected data and the latest mlr3spatiotempcv update.
The spatialBlock approach works, but when I use rsmp, I get the following error:
Error in st_geos_binop("intersects", x, y, sparse = sparse, prepared = prepared, : st_crs(x) == st_crs(y) is not TRUE
Here is a reproducible example. Session info is at the bottom.
library(mlr3)
library(mlr3spatiotempcv)
library(blockCV)
library(sf)
#> Linking to GEOS 3.8.1, GDAL 3.1.4, PROJ 6.3.1
library(raster)
#> Loading required package: sp
#>
#> Attaching package: 'raster'
#> The following object is masked from 'package:mlr3':
#>
#> resample
# prepare example data
set.seed(123)
x <- runif(5000, 1270260, 1778400)
y <- runif(5000, 1967070, 2292000)
data <- data.frame(spp="test",
label=factor(round(runif(length(x), 0, 1))),
x=x,
y=y)
# make sf object
# crs
crs <- "+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23 +lon_0=-96 +x_0=0 +y_0=0
+datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"
testSF <- st_as_sf(data[,c("x", "y", "spp")],
coords = c("x", "y"),
crs=crs)
# make raster
r <- raster(extent(testSF), crs=crs)
r[] <- 10
# spatial block cv
testBlock <- spatialBlock(speciesData = testSF,
species="spp",
rasterLayer = r,
theRange=50000,
k=1,
selection="checkerboard")
#> train_test test_test
#> 1 2525 2475
#> 2 2475 2525
# plot
# axis labels belong in plot(); points() does not accept ylab
plot(data$x[testBlock$folds[[1]][[1]]], data$y[testBlock$folds[[1]][[1]]],
col="red", pch=20, xlab="X_coord", ylab="Y_coord")
points(data$x[testBlock$folds[[1]][[2]]], data$y[testBlock$folds[[1]][[2]]],
col="orange", pch=20)
# mlr3spatiotempcv implementation
testTask <- TaskClassifST$new(id = "test",
backend = data,
target = "label",
positive="1",
extra_args = list(coordinate_names=c("x", "y"),
crs=crs))
blockSamp <- rsmp("spcv_block",
folds=2,
range=50000,
selection="checkerboard")
blockSamp$instantiate(testTask)
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj =
#> prefer_proj): Discarded datum Unknown based on GRS80 ellipsoid in CRS definition
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj =
#> prefer_proj): Discarded datum Unknown based on GRS80 ellipsoid in CRS definition
#> Error in st_geos_binop("intersects", x, y, sparse = sparse, prepared = prepared, :
#> st_crs(x) == st_crs(y) is not TRUE
# print session
sessionInfo()
#> R version 4.0.3 (2020-10-10)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Big Sur 10.16
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] raster_3.4-5 sp_1.4-5
#> [3] sf_0.9-7 blockCV_2.1.1
#> [5] mlr3spatiotempcv_0.2.0.9001 mlr3_0.11.0
#>
#> loaded via a namespace (and not attached):
#> [1] styler_1.3.2 progress_1.2.2 tidyselect_1.1.0
#> [4] xfun_0.20 purrr_0.3.4 lattice_0.20-41
#> [7] colorspace_2.0-0 vctrs_0.3.6 generics_0.1.0
#> [10] htmltools_0.5.0 yaml_2.2.1 utf8_1.1.4
#> [13] paradox_0.7.1 rlang_0.4.10 e1071_1.7-4
#> [16] pillar_1.5.1 glue_1.4.2 DBI_1.1.0
#> [19] palmerpenguins_0.1.0 uuid_0.1-4 lifecycle_1.0.0
#> [22] stringr_1.4.0 munsell_0.5.0 gtable_0.3.0
#> [25] codetools_0.2-16 evaluate_0.14 knitr_1.30
#> [28] parallel_4.0.3 class_7.3-17 fansi_0.4.2
#> [31] highr_0.8 Rcpp_1.0.6 KernSmooth_2.23-17
#> [34] backports_1.2.1 scales_1.1.1 classInt_0.4-3
#> [37] checkmate_2.0.0 farver_2.1.0 parallelly_1.23.0
#> [40] fs_1.5.0 ggplot2_3.3.3 hms_0.5.3
#> [43] digest_0.6.27 stringi_1.5.3 dplyr_1.0.2
#> [46] grid_4.0.3 rgdal_1.5-19 tools_4.0.3
#> [49] magrittr_2.0.1 tibble_3.1.0 mlr3misc_0.7.0
#> [52] crayon_1.4.1 pkgconfig_2.0.3 ellipsis_0.3.1
#> [55] data.table_1.14.0 prettyunits_1.1.1 reprex_1.0.0
#> [58] rmarkdown_2.6 lgr_0.4.2 R6_2.5.0
#> [61] units_0.6-7 compiler_4.0.3
Created on 2021-03-12 by the reprex package (v1.0.0)
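The message st_crs(x) == st_crs(y) is not TRUE comes from {sf}, which refuses to run binary geometry operations such as intersects on two objects whose coordinate reference systems differ. A minimal sketch of how to check for such a mismatch and resolve it is shown below. The two toy point sets and the EPSG codes are assumptions chosen for illustration and are unrelated to the data in the reprex.

```r
library(sf)

# Two toy point sets with different declared CRS (illustrative values only)
pts_a <- st_as_sf(data.frame(x = c(0, 1), y = c(0, 1)),
                  coords = c("x", "y"), crs = 4326)
pts_b <- st_as_sf(data.frame(x = c(0, 1), y = c(0, 1)),
                  coords = c("x", "y"), crs = 3857)

# The same comparison sf performs before a binary operation
crs_match_before <- st_crs(pts_a) == st_crs(pts_b)

# Reproject one object into the CRS of the other before combining them
pts_b_aligned <- st_transform(pts_b, st_crs(pts_a))
crs_match_after <- st_crs(pts_a) == st_crs(pts_b_aligned)

print(c(before = crs_match_before, after = crs_match_after))
```

Running a check like this on the objects handed to the resampling can help narrow down which of them lost or changed its CRS.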
Sorry to hear you ran into a new issue. Not enough tests yet, it seems 😄
I'll have a look in the next few days.
This was a tricky one.
First, the following example causes a memory overflow for me on macOS (though this is not really relevant to what follows):
library(sf)
library(blockCV)
set.seed(123)
x <- runif(5000, 1270260, 1778400)
y <- runif(5000, 1967070, 2292000)
data <- data.frame(spp="test",
label=factor(round(runif(length(x), 0, 1))),
x=x,
y=y)
testSF_epsg4326 <- st_as_sf(data[,c("x", "y", "spp")],
coords = c("x", "y"),
crs="EPSG:4326")
# r <- raster::raster(raster::extent(testSF_epsg4326), crs="EPSG:4326")
# r[] <- 10
testBlock <- spatialBlock(speciesData = testSF_epsg4326,
species="spp",
# rasterLayer = r,
showBlocks = FALSE,
theRange=50000,
k=1,
selection="checkerboard")
The issue is the rasterLayer argument here.
Commenting it out throws the same error as when using {mlr3spatiotempcv}.
Within {mlr3spatiotempcv}, rasterLayer was not used at all so far.
This is because I thought it was optional and only used for visualization purposes, but it looks like I was wrong.
Here is a reprex, using #94 as the base, that shows the new behavior:
library(mlr3)
library(mlr3spatiotempcv)
library(sf)
#> Linking to GEOS 3.9.1, GDAL 3.2.2, PROJ 7.2.1
library(blockCV)
# prepare example data ---------------------------------------------------------
set.seed(123)
x <- runif(5000, 1270260, 1778400)
y <- runif(5000, 1967070, 2292000)
data <- data.frame(
spp = "test",
label = factor(round(runif(length(x), 0, 1))),
x = x,
y = y
)
crs <- "+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23 +lon_0=-96 +x_0=0 +y_0=0
+datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"
testSF <- st_as_sf(data,
coords = c("x", "y"),
crs = crs
)
r <- raster::raster(raster::extent(testSF), crs = crs)
r[] <- 10
# mlr3spatiotempcv -------------------------------------------------------------
testTask <- TaskClassifST$new(
id = "test",
backend = testSF,
target = "label",
positive = "1"
)
blockSamp <- rsmp("spcv_block",
folds = 2,
range = 50000,
selection = "checkerboard",
rasterLayer = r
)
blockSamp$instantiate(testTask)
# blockCV directly -------------------------------------------------------------
testBlock <- spatialBlock(
speciesData = testSF,
species = "spp",
rasterLayer = r,
showBlocks = FALSE,
theRange = 50000,
verbose = FALSE,
selection = "checkerboard"
)
# check for equality
all.equal(blockSamp$instance$fold, testBlock$foldID)
#> [1] TRUE
Created on 2021-03-18 by the reprex package (v1.0.0)
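The all.equal() check above compares fold labels directly, which works here because both calls go through {blockCV}. When comparing fold assignments from two independent implementations, the labels may be permuted even though the partition is the same. A small base R sketch of a label-agnostic comparison follows. The helper name same_partition and the toy vectors are assumptions for illustration and are not part of either package.

```r
# Hypothetical helper: TRUE if two fold vectors describe the same partition,
# regardless of how the folds are numbered.
same_partition <- function(a, b) {
  stopifnot(length(a) == length(b))
  tab <- table(a, b)
  # each label in a must map to exactly one label in b, and vice versa
  all(rowSums(tab > 0) == 1) && all(colSums(tab > 0) == 1)
}

fold_a <- c(1, 1, 2, 2, 1)
fold_b <- c(2, 2, 1, 1, 2)  # same grouping, labels swapped
fold_c <- c(1, 2, 2, 2, 1)  # different grouping

same_partition(fold_a, fold_b)
same_partition(fold_a, fold_c)
```

The first call returns TRUE and the second FALSE, because fold_c splits the observations that fold_a groups under label 1.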
FYI: Since v0.2.0 you can create a TaskST directly from sf objects; the reprex gives an example.