Comments (7)

pat-s commented on May 23, 2024

Could you also share your {blockCV} code and demonstrate the different outcomes using a common seed?

from mlr3spatiotempcv.

fitzLab-AL commented on May 23, 2024

Thanks for your reply. Here's an expanded example that also uses {blockCV}.

library(mlr3)
library(mlr3spatiotempcv)


set.seed(123)
x <- runif(5000, -80.5, -75)
y <- runif(5000, 39.7, 42)

data <- data.frame(spp="test", 
                   label=factor(round(runif(length(x), 0, 1))),
                   x=x,
                   y=y)

testTask <- TaskClassifST$new(id = "test", 
                              backend = data, 
                              target = "label",
                              positive="1",
                              extra_args = list(coordinate_names=c("x", "y"),
                                                crs="EPSG: 4326"))

blockSamp <- rsmp("spcv_block",
                  folds=2,
                  range=50000,
                  selection="checkerboard")
blockSamp$instantiate(testTask)
plot(blockSamp, testTask)
#> CRS not set, transforming to WGS84 (EPSG: 4326).

library(blockCV)
library(sf)
#> Linking to GEOS 3.8.1, GDAL 3.1.4, PROJ 6.3.1

testSF <- st_as_sf(data[,c("x", "y", "spp")],
                   coords = c("x", "y"),
                   crs="EPSG: 4326")
testBlock <- spatialBlock(speciesData = testSF,
                          species="spp",
                          theRange=50000,
                          k=1,
                          selection="checkerboard")
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
#>   train_test test_test
#> 1       2504      2496
#> 2       2496      2504
#> Warning in st_point_on_surface.sfc(sf::st_zm(x)): st_point_on_surface may not
#> give correct results for longitude/latitude data

plot(data$x[testBlock$folds[[1]][[1]]], data$y[testBlock$folds[[1]][[1]]],
     col="red", pch=20, xlab="X_coord")
points(data$x[testBlock$folds[[1]][[2]]], data$y[testBlock$folds[[1]][[2]]],
     col="orange", pch=20, ylab="Y_coord")

Created on 2021-03-09 by the reprex package (v1.0.0)

pat-s commented on May 23, 2024

Thanks! This is a bug - I could reproduce it.

I'll fix it in the next few days.

fitzLab-AL commented on May 23, 2024

Thanks for fixing this issue. I think I may have run into a new problem when using projected data and the latest mlr3spatiotempcv update.

The spatialBlock approach works, but when I use rsmp, I get the following error:

Error in st_geos_binop("intersects", x, y, sparse = sparse, prepared = prepared, : st_crs(x) == st_crs(y) is not TRUE

Here is a reproducible example. Session info is at the bottom.

library(mlr3)
library(mlr3spatiotempcv)
library(blockCV)
library(sf)
#> Linking to GEOS 3.8.1, GDAL 3.1.4, PROJ 6.3.1
library(raster)
#> Loading required package: sp
#> 
#> Attaching package: 'raster'
#> The following object is masked from 'package:mlr3':
#> 
#>     resample

# prepare example data
set.seed(123)
x <- runif(5000, 1270260, 1778400)
y <- runif(5000, 1967070, 2292000)

data <- data.frame(spp="test", 
                   label=factor(round(runif(length(x), 0, 1))),
                   x=x,
                   y=y)

# make sf object
# crs
crs <- "+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23 +lon_0=-96 +x_0=0 +y_0=0
+datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"

testSF <- st_as_sf(data[,c("x", "y", "spp")],
                   coords = c("x", "y"),
                   crs=crs)

# make raster
r <- raster(extent(testSF), crs=crs)
r[] <- 10

# spatial block cv
testBlock <- spatialBlock(speciesData = testSF,
                          species="spp",
                          rasterLayer = r,
                          theRange=50000,
                          k=1,
                          selection="checkerboard")
#>   train_test test_test
#> 1       2525      2475
#> 2       2475      2525

# plot
plot(data$x[testBlock$folds[[1]][[1]]], data$y[testBlock$folds[[1]][[1]]],
     col="red", pch=20, xlab="X_coord")
points(data$x[testBlock$folds[[1]][[2]]], data$y[testBlock$folds[[1]][[2]]],
       col="orange", pch=20, ylab="Y_coord")

# mlr3spatiotempcv implementation
testTask <- TaskClassifST$new(id = "test", 
                              backend = data, 
                              target = "label",
                              positive="1",
                              extra_args = list(coordinate_names=c("x", "y"),
                                                crs=crs))

blockSamp <- rsmp("spcv_block",
                  folds=2,
                  range=50000,
                  selection="checkerboard")
blockSamp$instantiate(testTask)
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj =
#> prefer_proj): Discarded datum Unknown based on GRS80 ellipsoid in CRS definition

#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj =
#> prefer_proj): Discarded datum Unknown based on GRS80 ellipsoid in CRS definition
#> Error in st_geos_binop("intersects", x, y, sparse = sparse, prepared = prepared, : 
#> st_crs(x) == st_crs(y) is not TRUE

# print session
sessionInfo()
#> R version 4.0.3 (2020-10-10)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Big Sur 10.16
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] raster_3.4-5                sp_1.4-5                   
#> [3] sf_0.9-7                    blockCV_2.1.1              
#> [5] mlr3spatiotempcv_0.2.0.9001 mlr3_0.11.0                
#> 
#> loaded via a namespace (and not attached):
#>  [1] styler_1.3.2         progress_1.2.2       tidyselect_1.1.0    
#>  [4] xfun_0.20            purrr_0.3.4          lattice_0.20-41     
#>  [7] colorspace_2.0-0     vctrs_0.3.6          generics_0.1.0      
#> [10] htmltools_0.5.0      yaml_2.2.1           utf8_1.1.4          
#> [13] paradox_0.7.1        rlang_0.4.10         e1071_1.7-4         
#> [16] pillar_1.5.1         glue_1.4.2           DBI_1.1.0           
#> [19] palmerpenguins_0.1.0 uuid_0.1-4           lifecycle_1.0.0     
#> [22] stringr_1.4.0        munsell_0.5.0        gtable_0.3.0        
#> [25] codetools_0.2-16     evaluate_0.14        knitr_1.30          
#> [28] parallel_4.0.3       class_7.3-17         fansi_0.4.2         
#> [31] highr_0.8            Rcpp_1.0.6           KernSmooth_2.23-17  
#> [34] backports_1.2.1      scales_1.1.1         classInt_0.4-3      
#> [37] checkmate_2.0.0      farver_2.1.0         parallelly_1.23.0   
#> [40] fs_1.5.0             ggplot2_3.3.3        hms_0.5.3           
#> [43] digest_0.6.27        stringi_1.5.3        dplyr_1.0.2         
#> [46] grid_4.0.3           rgdal_1.5-19         tools_4.0.3         
#> [49] magrittr_2.0.1       tibble_3.1.0         mlr3misc_0.7.0      
#> [52] crayon_1.4.1         pkgconfig_2.0.3      ellipsis_0.3.1      
#> [55] data.table_1.14.0    prettyunits_1.1.1    reprex_1.0.0        
#> [58] rmarkdown_2.6        lgr_0.4.2            R6_2.5.0            
#> [61] units_0.6-7          compiler_4.0.3

Created on 2021-03-12 by the reprex package (v1.0.0)

pat-s commented on May 23, 2024

Sorry to hear you ran into a new issue. Not enough tests yet it seems 😄

I'll have a look in the next few days.

fitzLab-AL commented on May 23, 2024

pat-s commented on May 23, 2024

This was a tricky one.

First, the following example causes a memory overflow for me on macOS (though this is not really relevant to what follows):

library(sf)
library(blockCV)

set.seed(123)
x <- runif(5000, 1270260, 1778400)
y <- runif(5000, 1967070, 2292000)
data <- data.frame(spp="test", 
                   label=factor(round(runif(length(x), 0, 1))),
                   x=x,
                   y=y)



testSF_epsg4326 <- st_as_sf(data[,c("x", "y", "spp")],
                            coords = c("x", "y"),
                            crs="EPSG:4326")

# r <- raster::raster(raster::extent(testSF_epsg4326), crs="EPSG:4326")
# r[] <- 10

testBlock <- spatialBlock(speciesData = testSF_epsg4326,
                          species="spp",
                          # rasterLayer = r,
                          showBlocks = FALSE,
                          theRange=50000,
                          k=1,
                          selection="checkerboard")

The issue here is the rasterLayer argument.
Commenting it out throws the same error as when using {mlr3spatiotempcv}.
Within {mlr3spatiotempcv}, rasterLayer was not used at all so far.
This is because I thought it was optional and only used for visualization purposes, but it looks like I was wrong.

Here is a reprex, using #94 as the base, which shows the new behavior:

library(mlr3)
library(mlr3spatiotempcv)
library(sf)
#> Linking to GEOS 3.9.1, GDAL 3.2.2, PROJ 7.2.1
library(blockCV)

# prepare example data ---------------------------------------------------------

set.seed(123)
x <- runif(5000, 1270260, 1778400)
y <- runif(5000, 1967070, 2292000)

data <- data.frame(
  spp = "test",
  label = factor(round(runif(length(x), 0, 1))),
  x = x,
  y = y
)

crs <- "+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23 +lon_0=-96 +x_0=0 +y_0=0
+datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"

testSF <- st_as_sf(data,
  coords = c("x", "y"),
  crs = crs
)

r <- raster::raster(raster::extent(testSF), crs = crs)
r[] <- 10

# mlr3spatiotempcv -------------------------------------------------------------

testTask <- TaskClassifST$new(
  id = "test",
  backend = testSF,
  target = "label",
  positive = "1"
)

blockSamp <- rsmp("spcv_block",
  folds = 2,
  range = 50000,
  selection = "checkerboard",
  rasterLayer = r
)
blockSamp$instantiate(testTask)

# blockCV directly -------------------------------------------------------------

testBlock <- spatialBlock(
  speciesData = testSF,
  species = "spp",
  rasterLayer = r,
  showBlocks = FALSE,
  theRange = 50000,
  verbose = FALSE,
  selection = "checkerboard"
)

# check for equality
all.equal(blockSamp$instance$fold, testBlock$foldID)
#> [1] TRUE

Created on 2021-03-18 by the reprex package (v1.0.0)

FYI: Since v0.2.0 you can create TaskST directly using sf objects, the reprex gives an example.
