prioritizr's Introduction

prioritizr

Systematic Conservation Prioritization in R


The prioritizr R package uses mixed integer linear programming (MILP) techniques to provide a flexible interface for building and solving conservation planning problems. It supports a broad range of objectives, constraints, and penalties that can be used to custom-tailor conservation planning problems to the specific needs of a conservation planning exercise. Once built, conservation planning problems can be solved using a variety of commercial and open-source exact algorithm solvers. In contrast to the algorithms conventionally used to solve conservation problems, such as heuristics or simulated annealing, the exact algorithms used here are guaranteed to find optimal solutions. Furthermore, conservation problems can be constructed to optimize the spatial allocation of different management actions or zones, meaning that conservation practitioners can identify solutions that benefit multiple stakeholders. Finally, this package has the functionality to read input data formatted for the Marxan conservation planning program, and find much cheaper solutions in a much shorter period of time than Marxan.

Installation

Official version

The latest official version of the prioritizr R package can be installed from the Comprehensive R Archive Network (CRAN) using the following R code.

install.packages("prioritizr", repos = "https://cran.rstudio.com/")

Developmental version

The latest development version can be installed to gain access to new functionality that is not yet present in the latest official version. Please note that the development version is more likely to contain coding errors than the official version. It can be installed directly from the GitHub online code repository or from the R Universe. In general, we recommend installing from the R Universe, because this does not require any additional software (e.g., RTools for Windows systems, or Xcode and gfortran for macOS systems).

  • To install the latest development version from R Universe, use the following R code.

    install.packages(
      "prioritizr",
      repos = c(
        "https://prioritizr.r-universe.dev",
        "https://cloud.r-project.org"
      )
    )
  • To install the latest development version from GitHub, use the following R code.

    if (!require(remotes)) install.packages("remotes")
    remotes::install_github("prioritizr/prioritizr")

Citation

Please cite the prioritizr R package when using it in publications. To cite the latest official version, please use:

Hanson JO, Schuster R, Morrell N, Strimas-Mackey M, Edwards BPM, Watts ME, Arcese P, Bennett J, Possingham HP (2023). prioritizr: Systematic Conservation Prioritization in R. R package version 8.0.3. Available at https://CRAN.R-project.org/package=prioritizr.

Alternatively, to cite the latest development version, please use:

Hanson JO, Schuster R, Morrell N, Strimas-Mackey M, Edwards BPM, Watts ME, Arcese P, Bennett J, Possingham HP (2024). prioritizr: Systematic Conservation Prioritization in R. R package version 8.0.3.5. Available at https://github.com/prioritizr/prioritizr.

Additionally, we keep a record of publications that use the prioritizr R package. If you use this package in any reports or publications, please file an issue on GitHub so we can add it to the record.

Usage

Here we provide a short example showing how the prioritizr R package can be used to build and solve conservation problems. Specifically, we will use an example dataset available through the prioritizrdata R package. Additionally, we will use the terra R package to perform raster calculations. To begin with, we will load the packages.

# load packages
library(prioritizr)
library(prioritizrdata)
library(terra)

We will use the Washington dataset in this example. To import the planning unit data, we will use the get_wa_pu() function. Although the prioritizr R package can support many different types of planning unit data, here our planning units are represented as a single-layer raster (i.e., terra::rast() object). Each cell represents a different planning unit, and cell values denote land acquisition costs. Specifically, there are 10757 planning units in total (i.e., cells with non-missing values).

# import planning unit data
wa_pu <- get_wa_pu()

# preview data
print(wa_pu)
## class       : SpatRaster 
## dimensions  : 109, 147, 1  (nrow, ncol, nlyr)
## resolution  : 4000, 4000  (x, y)
## extent      : -1816382, -1228382, 247483.5, 683483.5  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=laea +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0 +ellps=sphere +units=m +no_defs 
## source      : wa_pu.tif 
## name        :         cost 
## min value   :    0.2986647 
## max value   : 1804.1838379
# plot data
plot(wa_pu, main = "Costs", axes = FALSE)

Next, we will use the get_wa_features() function to import the conservation feature data. Although the prioritizr R package can support many different types of feature data, here our feature data are represented as a multi-layer raster (i.e., terra::rast() object). Each layer describes the spatial distribution of a feature. Here, our feature data correspond to different bird species. To account for migratory patterns, the breeding and non-breeding distributions of species are represented as different features. Specifically, the cell values denote the relative abundance of individuals, with higher values indicating greater abundance.

# import feature data
wa_features <- get_wa_features()

# preview data
print(wa_features)
## class       : SpatRaster 
## dimensions  : 109, 147, 396  (nrow, ncol, nlyr)
## resolution  : 4000, 4000  (x, y)
## extent      : -1816382, -1228382, 247483.5, 683483.5  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=laea +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0 +ellps=sphere +units=m +no_defs 
## source      : wa_features.tif 
## names       : Recur~ding), Botau~ding), Botau~ding), Corvu~ding), Corvu~ding), Cincl~full), ... 
## min values  :       0.000,       0.000,       0.000,       0.000,       0.000,        0.00, ... 
## max values  :       0.514,       0.812,       3.129,       0.115,       0.296,        0.06, ...
# plot the first nine features
plot(wa_features[[1:9]], nr = 3, axes = FALSE)

Let’s make sure that you have a solver installed on your computer. This is important so that you can use optimization algorithms to generate spatial prioritizations. If this is your first time using the prioritizr R package, please install the HiGHS solver using the following R code. Although the HiGHS solver is relatively fast and easy to install, please note that you’ll need to install the Gurobi software suite and the gurobi R package for best performance (see the Gurobi Installation Guide for details).

# if needed, install HiGHS solver
install.packages("highs", repos = "https://cran.rstudio.com/")

Now, let’s generate a spatial prioritization. To ensure feasibility, we will set a budget. Specifically, the total cost of the prioritization will represent 5% of the total land value in the study area. Given this budget, we want the prioritization to increase feature representation as much as possible, so that each feature would, ideally, have 20% of its distribution covered by the prioritization. In this scenario, we can either purchase all of the land inside a given planning unit, or none of the land inside a given planning unit. Thus, we will create a new problem() that will use a minimum shortfall objective (via add_min_shortfall_objective()), with relative targets of 20% (via add_relative_targets()), binary decisions (via add_binary_decisions()), and specify that we want near-optimal solutions (i.e., within 10% of optimality) using the best solver installed on our computer (via add_default_solver()).

# calculate budget
budget <- terra::global(wa_pu, "sum", na.rm = TRUE)[[1]] * 0.05

# create problem
p1 <-
  problem(wa_pu, features = wa_features) %>%
  add_min_shortfall_objective(budget) %>%
  add_relative_targets(0.2) %>%
  add_binary_decisions() %>%
  add_default_solver(gap = 0.1, verbose = FALSE)

# print problem
print(p1)
## A conservation problem (<ConservationProblem>)
## ├•data
## │├•features:    "Recurvirostra americana (breeding)" , … (396 total)
## │└•planning units:
## │ ├•data:       <SpatRaster> (10757 total)
## │ ├•costs:      continuous values (between 0.2987 and 1804.1838)
## │ ├•extent:     -1816381.6182, 247483.5211, -1228381.6182, 683483.5211 (xmin, ymin, xmax, ymax)
## │ └•CRS:        +proj=laea +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0 +ellps=sphere +units=m +no_defs (projected)
## ├•formulation
## │├•objective:   minimum shortfall objective (`budget` = 8748.4908)
## │├•penalties:   none specified
## │├•targets:     relative targets (between 0.2 and 0.2)
## │├•constraints: none specified
## │└•decisions:   binary decision
## └•optimization
##  ├•portfolio:   shuffle portfolio (`number_solutions` = 1, …)
##  └•solver:      gurobi solver (`gap` = 0.1, `time_limit` = 2147483647, `first_feasible` = FALSE, …)
## # ℹ Use `summary(...)` to see complete formulation.

After we have built a problem(), we can solve it to obtain a solution.

# solve the problem
s1 <- solve(p1)

# extract the objective
print(attr(s1, "objective"))
## solution_1 
##    4.40521
# extract time spent solving the problem
print(attr(s1, "runtime"))
## solution_1 
##      3.394
# extract state message from the solver
print(attr(s1, "status"))
## solution_1 
##  "OPTIMAL"
# plot the solution
plot(s1, main = "Solution", axes = FALSE)

After generating a solution, it is important to evaluate it. Here, we will calculate the number of planning units selected by the solution, and the total cost of the solution. We can also check how many representation targets are met by the solution.

# calculate number of selected planning units by solution
eval_n_summary(p1, s1)
## # A tibble: 1 × 2
##   summary     n
##   <chr>   <dbl>
## 1 overall  2319
# calculate total cost of solution
eval_cost_summary(p1, s1)
## # A tibble: 1 × 2
##   summary  cost
##   <chr>   <dbl>
## 1 overall 8748.
# calculate target coverage for the solution
p1_target_coverage <- eval_target_coverage_summary(p1, s1)
print(p1_target_coverage)
## # A tibble: 396 × 9
##    feature   met   total_amount absolute_target absolute_held absolute_shortfall
##    <chr>     <lgl>        <dbl>           <dbl>         <dbl>              <dbl>
##  1 Recurvir… TRUE         100.             20.0          23.4               0   
##  2 Botaurus… TRUE          99.9            20.0          29.2               0   
##  3 Botaurus… TRUE         100.             20.0          34.0               0   
##  4 Corvus b… TRUE          99.9            20.0          20.2               0   
##  5 Corvus b… FALSE         99.9            20.0          18.7               1.29
##  6 Cinclus … TRUE         100.             20.0          20.4               0   
##  7 Spinus t… TRUE          99.9            20.0          22.4               0   
##  8 Spinus t… TRUE          99.9            20.0          23.0               0   
##  9 Falco sp… TRUE          99.9            20.0          24.5               0   
## 10 Falco sp… TRUE         100.             20.0          24.4               0   
## # ℹ 386 more rows
## # ℹ 3 more variables: relative_target <dbl>, relative_held <dbl>,
## #   relative_shortfall <dbl>
# check percentage of the features that have their target met given the solution
print(mean(p1_target_coverage$met) * 100)
## [1] 96.46465

Although this solution helps meet the representation targets, it does not account for existing protected areas inside the study area. As such, it does not account for the possibility that some features could be partially – or even fully – represented by existing protected areas and, in turn, might fail to identify meaningful priorities for new protected areas. To address this issue, we will use the get_wa_locked_in() function to import spatial data for protected areas in the study area. We will then add constraints to the problem() to ensure they are selected by the solution (via add_locked_in_constraints()).

# import locked in data
wa_locked_in <- get_wa_locked_in()

# print data
print(wa_locked_in)
## class       : SpatRaster 
## dimensions  : 109, 147, 1  (nrow, ncol, nlyr)
## resolution  : 4000, 4000  (x, y)
## extent      : -1816382, -1228382, 247483.5, 683483.5  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=laea +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0 +ellps=sphere +units=m +no_defs 
## source      : wa_locked_in.tif 
## name        : protected areas 
## min value   :               0 
## max value   :               1
# plot data
plot(wa_locked_in, main = "Existing protected areas", axes = FALSE)

# create new problem with locked in constraints added to it
p2 <-
  p1 %>%
  add_locked_in_constraints(wa_locked_in)

# solve the problem
s2 <- solve(p2)

# plot the solution
plot(s2, main = "Solution", axes = FALSE)

This solution is an improvement over the previous solution. However, there are some places in the study area that are not available for protected area establishment (e.g., due to land tenure). As a consequence, the solution might not be practical for implementation, because it might select some places that are not available for protection. To address this issue, we will use the get_wa_locked_out() function to import spatial data describing which planning units are not available for protection. We will then add constraints to the problem() to ensure they are not selected by the solution (via add_locked_out_constraints()).

# import locked out data
wa_locked_out <- get_wa_locked_out()

# print data
print(wa_locked_out)
## class       : SpatRaster 
## dimensions  : 109, 147, 1  (nrow, ncol, nlyr)
## resolution  : 4000, 4000  (x, y)
## extent      : -1816382, -1228382, 247483.5, 683483.5  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=laea +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0 +ellps=sphere +units=m +no_defs 
## source      : wa_locked_out.tif 
## name        : urban areas 
## min value   :           0 
## max value   :           1
# plot data
plot(wa_locked_out, main = "Areas not available for protection", axes = FALSE)

# create new problem with locked out constraints added to it
p3 <-
  p2 %>%
  add_locked_out_constraints(wa_locked_out)

# solve the problem
s3 <- solve(p3)

# plot the solution
plot(s3, main = "Solution", axes = FALSE)

This solution is even better than the previous solution. However, we are not finished yet. The planning units selected by the solution are fairly fragmented. This can cause issues because fragmentation increases management costs and reduces conservation benefits through edge effects. To address this issue, we can further modify the problem by adding penalties that punish overly fragmented solutions (via add_boundary_penalties()). Here we will use a penalty factor (i.e., boundary length modifier) of 0.003, and an edge factor of 50% so that planning units that occur on the outer edge of the study area are not overly penalized.

# create new problem with boundary penalties added to it
p4 <-
  p3 %>%
  add_boundary_penalties(penalty = 0.003, edge_factor = 0.5)

# solve the problem
s4 <- solve(p4)

# plot the solution
plot(s4, main = "Solution", axes = FALSE)

Now, let’s explore which planning units selected by the solution are most important for cost-effectively meeting the targets. To achieve this, we will calculate importance (irreplaceability) scores using the Ferrier method. Although this method produces scores for each feature separately, we will examine the total scores that summarize overall importance across all features.

# calculate importance scores
rc <-
  p4 %>%
  eval_ferrier_importance(s4)

# print scores
print(rc)
## class       : SpatRaster 
## dimensions  : 109, 147, 397  (nrow, ncol, nlyr)
## resolution  : 4000, 4000  (x, y)
## extent      : -1816382, -1228382, 247483.5, 683483.5  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=laea +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0 +ellps=sphere +units=m +no_defs 
## source(s)   : memory
## varnames    : wa_pu 
##               wa_pu 
##               wa_pu 
##               ...
## names       :  Recur~ding),  Botau~ding),  Botau~ding),  Corvu~ding),  Corvu~ding),  Cincl~full), ... 
## min values  : 0.0000000000, 0.0000000000, 0.0000000000, 0.000000e+00, 0.000000e+00, 0.000000e+00, ... 
## max values  : 0.0003227724, 0.0002213034, 0.0006622152, 7.771815e-05, 8.974447e-05, 8.483296e-05, ...
# plot the total importance scores
## note that gray cells are not selected by the prioritization
plot(
  rc[["total"]], main = "Importance scores", axes = FALSE,
  breaks = c(0, 1e-10, 0.005, 0.01, 0.025),
  col = c("#e5e5e5", "#fff7ec", "#fc8d59", "#7f0000")
)

This short example demonstrates how the prioritizr R package can be used to build and customize conservation problems, and then solve them to generate solutions. Although we explored just a few different functions for modifying a conservation problem, the package provides many functions for specifying objectives, constraints, penalties, and decision variables, so that you can build and custom-tailor conservation planning problems to suit your planning scenario.

Learning resources

The package website (https://prioritizr.net) contains information on the prioritizr R package. Here you can find documentation for every function and built-in dataset, and news describing the updates in each package version. It also contains a range of articles and tutorials.

Additional resources can also be found in online repositories under the prioritizr organization. These resources include slides for talks and seminars about the package. Additionally, workshop materials are available too (e.g., the Carleton 2023 workshop).

Getting help

If you have any questions about the prioritizr R package or suggestions for improving it, please post an issue on the code repository.

prioritizr's People

Contributors

bird-team-bot, jaseeverett, jeffreyhanson, mstrimas, ninamorrell, orchid00, ricschuster, sandra-neubert


prioritizr's Issues

specifying time limit for the solver

Dear prioritizr users and developers
Could you please help me find a way to specify a good “time limit” for the solver when it is generating a solution?
When I don’t define this time limit, even with a small number of planning units (from the tutorial), the solver couldn’t reach the optimal solution after a long time. But I don’t know how much time is enough for the solver: 10 min, 100 min, or 1000 min?

Best regards
Iman
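
A minimal sketch of how a limit can be set, assuming `p` is a previously built problem() object; the time_limit argument of the solver functions (visible in the README problem summary near the top of this page) is expressed in seconds, and the 10-minute value here is purely illustrative.

# stop the solver after 600 seconds, keeping the best solution found so far
p <- p %>% add_gurobi_solver(gap = 0.1, time_limit = 600)
s <- solve(p)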

add_min_occ_constraints [feature proposal]

I propose adding an add_min_occ_constraints function that would allow users to add minimum occurrence constraints, similar to Marxan, such that each feature must be represented in at least n selected planning units.

The function could look something like this:

#' Add minimum occurrence constraints
#'
#' This function adds constraints to ensure that features occur in at least a pre-specified
#' number of planning units.
#'
#' @param x \code{\link{ConservationProblem}} object.
#'
#' @param n \code{numeric} minimum number of planning units that each feature needs to be
#'      represented in. The argument to \code{n} can be a single number to represent each
#'      feature in at least \code{n} planning units, or a \code{numeric} vector specifying
#'      a different minimum number of planning units for each feature.
#'
#' @details This constraint is only useful when the feature data are continuous (e.g. amounts or 
#'      proportions) and has no effect if the feature data are binary (i.e. zero or one).
#'
#' @return \code{\link{ConservationProblem}} with the constraint added to it.
#'
#' @seealso \code{\link{constraints}}.
#'
#' @export
add_min_occ_constraints <- function(x, n) {
   # insert magic here
}

What do people think? I think Marxan has this functionality. Would it be useful in prioritizr?
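
In the meantime, a rough workaround sketch using the existing targets interface: add binarized copies of the features, so that the amount held counts occurrences rather than abundance, and give those copies absolute targets. This assumes the simulated data that ships with the package and the add_manual_targets() interface; the target values (0.2 and 5) are illustrative only.

library(prioritizr)
library(raster)
data(sim_pu_raster, sim_features)

# binarized copies of the features, so the amount held in a planning unit
# counts occurrences rather than abundance
occ_features <- sim_features > 0
names(occ_features) <- paste0(names(sim_features), "_occ")

# 20% relative targets on the raw amounts, plus an absolute target of 5
# occurrences for each binarized copy
n_feat <- raster::nlayers(sim_features)
targets <- data.frame(
  feature = c(names(sim_features), names(occ_features)),
  type = c(rep("relative", n_feat), rep("absolute", n_feat)),
  target = c(rep(0.2, n_feat), rep(5, n_feat))
)

p <- problem(sim_pu_raster, raster::stack(sim_features, occ_features)) %>%
  add_min_set_objective() %>%
  add_manual_targets(targets) %>%
  add_binary_decisions()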

Saving and loading problem object

Creating and saving a problem object:

library(prioritizr)
p1 <- problem(sim_pu_polygons, features = sim_features,
              cost_column = "cost") %>%
      add_min_set_objective() %>%
      add_relative_targets(0.2) %>%
      add_binary_decisions()
save.image("temp.RData")

Subsequently quitting the R session and opening a new one:

library(prioritizr)
load("temp.RData")

When trying to access p1 I get this error message:
Error in vapply(list(self$objective, self$targets), function(x) { :
argument "self" is missing, with no default

I'm wondering if prioritizr is saving to the R temp folder and storing the folder name, which breaks the connection to p1 once a new R session is started?
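
Until the underlying issue is resolved, a workaround sketch using base R serialization: save the raw inputs rather than the problem object, and rebuild the problem in each new session.

# save the raw inputs rather than the problem object
saveRDS(list(pu = sim_pu_polygons, features = sim_features), "temp.rds")

# then, in a fresh R session, rebuild the problem from the saved inputs
library(prioritizr)
inputs <- readRDS("temp.rds")
p1 <- problem(inputs$pu, features = inputs$features, cost_column = "cost") %>%
  add_min_set_objective() %>%
  add_relative_targets(0.2) %>%
  add_binary_decisions()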

License?

Hi all,

What license should we release the package under? I'm a fan of GPL version 3 - what do you guys think?

Cheers,

Jeff

EDIT: I've put the package license as GPL3 for the moment, but this can change if needed.

Personal websites

Hi all,

The prioritizr website has links to people's websites. Ages ago, I tried finding people's personal web pages by googling them, but you might prefer a different page. If you want to change the web page that the prioritizr website links to, you can edit the _pkgdown.yml file.

@ninamorrell you might like to add your personal website if you have one.

Cheers,

Jeff

Ingest Marxan data files

I think having the ability to ingest Marxan data files and solve them using ILP would be very useful. But how well supported should this be?

I think this feature will be crucial for showing first time users the benefits of using ILP. After the release of the earlier version of the prioritizr package, I had trouble getting people to try it.

So, I was thinking that the Marxan ingest function would ingest the files and then solve the problem. We intentionally would not allow users to customise the problem further by adding constraints using the standard syntax. This way users can see for themselves that ILP is awesome (well, hopefully haha), and then will want to learn how to create and customise problems using the standard syntax.

Thoughts?
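
A minimal sketch of the envisaged workflow, assuming a Marxan parameter file named input.dat (the file name is hypothetical; marxan_problem() taking a path to an input file is shown in a later issue on this page).

# sketch: ingest a Marxan parameter file and solve it directly with ILP
library(prioritizr)
p <- marxan_problem("input.dat")
s <- solve(p)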

add_rsymphony_solver gap > 1 error

When trying to use a gap > 1, add_rsymphony_solver throws an error. However, according to the help page, users should be allowed to specify gaps greater than 1 for this solver:
But for other solvers (e.g. Rsymphony), this gap is absolute and expresses the acceptable deviance from the optimal objective. For example, solving a minimum set objective problem with a gap of 5 will cause the solver to terminate when the cost of the solution is within 5 cost units from the optimal solution.

library(prioritizr)
data(sim_pu_polygons)
data(sim_features)

p1 <- problem(sim_pu_polygons, features = sim_features,
              cost_column = "cost") %>%
  add_min_set_objective() %>%
  add_relative_targets(0.2) %>%
  add_binary_decisions() %>%
  add_rsymphony_solver(gap = 2)

# Error: isTRUE(x = gap <= 1) is not TRUE

add_max_cover_objective [potential bug?]

I was reading over some of the older conservation planning literature, and the add_max_cover_objective function doesn't follow the conventional formulation of the maximum covering problem. The conventional formulation aims to maximise the number of species that are represented in at least 1 planning unit. The current formulation aims to maximise the total level of representation in the solution.

Although this is documented in the help page, I worry that users might misinterpret what the add_max_cover_objective does. I suggest renaming the current version as add_max_utility_objective and implementing the classic maximum covering objective in add_max_cover_objective.

What do people think?

Documentation

Starting April 10, @ninamorrell will be working on this project for an entire month. Let's start pulling a to-do list together, so we can make sure her time is used effectively.

add_max_phylo_objective [documentation]

The equation and mathematical interpretation of this objective function, as described in the details section of the documentation, are incorrect. Note that the objective function works as described in the first paragraph of the function's help page. I'm just making a note that it needs updating so I don't forget.

publication record [documentation proposal]

Hi,

I was wondering if it would be worth keeping a record of all the publications that use prioritizr? This could be a useful resource for users, and for promotional purposes in the future.

If you think it would be useful, how do you think this record should be organised? Should it be a vignette available in the package? This has the advantage of being easily incorporated into the package's website. Or a wiki page on the GitHub repository? This has the advantage of being more easily editable by other people.

Thoughts?

solution output

In addition to the (usually raster) solution output, it would be great if solutions also included the parameters returned by the solvers, such as runtime, status (i.e., did the solver reach an optimal solution?), and cost. At least for me, that's very useful information to have.
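
For context, the README example near the top of this page shows that this later became available: solver metadata is attached to the solution object as attributes.

# as shown in the README example above
print(attr(s1, "objective"))  # objective value of the solution
print(attr(s1, "runtime"))    # time spent solving the problem
print(attr(s1, "status"))     # status message from the solver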

What is new in Version 3.0.3?

Dear prioritizr developers
I found that this package was recently updated to 3.0.3, but I couldn't find the list of changes applied in this version.
Could you please tell me about the changes?
I hope some of the changes are relevant to speeding up this package (using free optimization solvers), because my laptop has now been working for 7 days to find a solution for 50,000 PUs using Rsymphony on Linux.

problem S3 method for data.frame

When data.frames are supplied to the problem function and rij does not include the max value of x$id, an error is generated after solve is run:

Error in `$<-.data.frame`(`*tmp*`, solution, value = c(0, 0, 0, 0, 0, : replacement has XX rows, data has YY

maxcovr: maximum covering location problem

This looks like a great package, Jeff!

I've been writing a package, maxcovr, which helps solve the maximal covering location problem. Not sure if that could be useful to you, but just thought I'd let you know in case you want to use code or import things from there.

On a somewhat related note, is it me, or is there not a dedicated modern R package for calculating distances between points and other distance related tasks ?

planning unit importance [feature proposal]

It seems like a lot of people would like some measure of importance for each planning unit in a solution (e.g., selection frequency from Marxan, or ranking from Zonation). So I wonder if we should try adding some functionality to calculate "planning unit priority" or "planning unit importance" given a solution? I remember there was some talk of this on the Marxan mailing list a while ago.

Do people think this would be useful? What algorithms/metrics should be used? Ideally the algorithm/metric would be problem invariant, so it would be useful no matter what constraints/objectives/decisions were used.
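
For context, the released package now provides importance functions, such as eval_ferrier_importance() used in the README example near the top of this page. A minimal sketch, assuming `p` is a built problem and `s` its solution:

# Ferrier importance scores for a solution, summarised across features
rc <- eval_ferrier_importance(p, s)
plot(rc[["total"]], main = "Importance scores")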

Long problem compiling times

A PhD student at CBCS has been trying out this package for some of their work, and they're reporting that it takes 5 hours to compile a problem with Raster-class planning units and RasterStack-class features. The data set contains 17,000 features and over 50,000 planning units. After compiling the problem, Gurobi takes 25 seconds to solve it.

These run times are unacceptable, so I'm going to try and get to the bottom of this today. Has anyone experienced these issues?

r-hub builder prioritizr 3.0.1: ERROR

Build ID: prioritizr_3.0.1.tar.gz-610e8ddbf5f4e08d4e406bb25a2120e8
Platform: Debian Linux, R-devel, GCC
Submitted: 27 minutes 55.3 seconds ago
Build time: 26 minutes 59.6 seconds
ERRORS:

  • checking tests ...
    Running ‘testthat.R’
    ERROR
    Running the tests in ‘tests/testthat.R’ failed.
    Last 13 lines of output:

    • Optimal Solution Found *

    Solution Found: Node 0, Level 0
    Solution Cost: 4.0000000000
    list()
    attr(,"class")
    [1] "Waiver"
    testthat results ================================================================
    OK: 8428 SKIPPED: 47 FAILED: 1

    1. Failure: solution (compressed formulation) (@test_add_max_cover_objective.R#48)

    Error: testthat unit tests failed
    Execution halted

NOTES:

  • checking package dependencies ... NOTE
    Packages suggested but not available for checking: ‘gurobi’ ‘Rsymphony’

  • checking installed package size ... NOTE
    installed size is 6.2Mb
    sub-directories of 1Mb or more:
    doc 3.4Mb
    libs 1.7Mb

Complete log

Unsuccessful installation on Linux ubuntu

Dear users and developers
After extremely slow function on windows, I decided to run the package on Linux. When I try to install the package using:
if (!require(devtools)) install.packages("devtools")
devtools::install_github("prioritizr/prioritizr")
I received an error:

"ERROR:
configuration failed for package ‘git2r’
-removing ‘/home/iman/R/x86_64-pc-linux-gnu-library/3.2/git2r’
Warning in install.packages :
installation of package ‘git2r’ had non-zero exit status
ERROR: dependencies ‘curl’, ‘openssl’ are not available for package ‘httr’
-removing ‘/home/iman/R/x86_64-pc-linux-gnu-library/3.2/httr’
Warning in install.packages :
installation of package ‘httr’ had non-zero exit status
ERROR: dependencies ‘httr’, ‘git2r’ are not available for package ‘devtools’
-removing ‘/home/iman/R/x86_64-pc-linux-gnu-library/3.2/devtools’
Warning in install.packages :
installation of package ‘devtools’ had non-zero exit status
The downloaded source packages are in
‘/tmp/Rtmp9hJtFf/downloaded_packages’
Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, :
there is no package called ‘devtools’
devtools::install_github("prioritizr/prioritizr")
Error in loadNamespace(name) : there is no package called ‘devtools’

How can I install prioritizr on Linux Ubuntu? It is not available on CRAN.

velox package dependency

Should
import(velox)
be added to NAMESPACE?

p1 <- problem(sim_pu_polygons, features = sim_features, cost_name = "cost") %>%
  add_min_set_objective() %>%
  add_relative_targets(0.2) %>%
  add_binary_decisions()
## Loading required namespace: velox
## Failed with error: ‘there is no package called ‘velox’’

marxan cloud

Any takers?

Sometimes Nectar regions go out without warning, causing outages on our website.

Volunteers needed. The more the merrier. Monitor, restart, troubleshoot Marxan web (completely redevelop in your own image if you like also).

I'm using OpenStack on Nectar Cloud nectar.org.au , so anyone with an AAF credential (any technically literate student/postdoc/prof/etc affiliated with an account from an Australian University or AAF accredited org like CSIRO) can be added to the mix so they can monitor Marxan infrastructure and restart/etc if required. AAF is required so I can add them to the Nectar dashboard as collaborators on Marxan web - unfortunately for now we can't include our overseas colleagues as admins on Marxan cloud.

package authors and citation

Hi,

I think it would be good to sort out the proper citation for the package and its authors. To my knowledge, I've included everyone who's involved with the project in the citation section of the readme. I've copied it below.

Hanson JO, Schuster R, Strimas-Mackey M, Arcese P, Bennett J (2017). prioritizr: Systematic 
Conservation Prioritization in R. R package version 1.0.0. 
https://github.com/prioritizr/prioritizr.

What does everyone think? Is anyone missing? Is the ordering correct?

Problem with Makevars when installing on Mac

Installing with devtools fails with the following error on macOS Sierra:

error: /Library/Developer/CommandLineTools/usr/bin/strip: unrecognized option: --strip-debug
Usage: /Library/Developer/CommandLineTools/usr/bin/strip [-AnuSXx] [-] [-d filename] [-s filename] [-R filename] [-o output] file [...] 
make: *** [strippedLib] Error 1
ERROR: compilation failed for package ‘prioritizr’
* removing ‘/Library/Frameworks/R.framework/Versions/3.4/Resources/library/prioritizr’
* restoring previous ‘/Library/Frameworks/R.framework/Versions/3.4/Resources/library/prioritizr’
Installation failed: Command failed (1)

Removing the following from Makevars (as in this fork) fixed the issue:

# Compact shared object libraries: http://dirk.eddelbuettel.com/blog/2017/08/14#009_compact_shared_libraries
strippedLib: $(SHLIB)
	if test -e "/usr/bin/strip"; then /usr/bin/strip --strip-debug $(SHLIB); fi

.phony: strippedLib

add_min_size_constraints [feature proposal]

I propose adding an add_min_size_constraints function that would allow users to specify the minimum size a reserve can be to represent a particular feature. I think these constraints could be very useful when there is information on home range size or the size of population extents, though perhaps this applies to only a small fraction of actual conservation planning scenarios.

The function could look something like this:

#' Add minimum size constraints
#'
#' This function adds constraints to ensure that reserves used to represent each feature are above
#' a particular size.
#'
#' @param x \code{\link{ConservationProblem}} object.
#'
#' @param size \code{numeric} minimum reserve size. The argument to \code{size} can be a single
#'      number to ensure that all reserves used to represent each feature are above a given size.
#'      If different minimum sizes are required for each feature, then a \code{numeric} vector
#'      with sizes for each feature can be supplied.
#'
#' @details This function is inspired by T{\'o}th \emph{et al.} (2009).
#'
#' @return \code{\link{ConservationProblem}} with the constraint added to it.
#'
#' @seealso \code{\link{constraints}}.
#'
#' @references
#' T{\'o}th SF, Haight RG, Snyder SA, George S, Miller JR, Gregory MS, Skibbe AM (2009) Reserve 
#' selection with minimum contiguous area restrictions: An application to open space protection 
#' planning in suburban Chicago. \emph{Biological Conservation}, 142: 1617--1627.
#'
#' @export
add_min_size_constraints <- function(x, size) {
   # insert magic here
}

What do people think? Would it be useful in prioritizr?

Create problem formulation, feature requires raster

It seems to me that both the 'standard'

problem(x, features, ...)

function as well as the marxan_problem function require the features to be supplied as rasters.

The initial idea for the marxan_problem function was this:

p <-marxan_problem(pu_polygons, cost=..., features=..., locked_in=..., 
  locked_out=..., penalty=..., edge_factor=....)

where the dots are column names or vectors of values.

I think it would be great to allow for the latter case.

Along the same lines, I think it would also be good if the standard 'problem' function allowed mixing spatial data frame classes with data frames (as features). That's the way I often use shapefiles and csv files together. Not sure how common it is, but I would imagine it's fairly common.

@jeffreyhanson, how is your availability in the next little while? Do you still have time to tackle these things or do you need to focus on finishing your PhD?

Marxan outputs [feature proposal]

It looks like it would be useful to have the functionality to export Marxan-compatible outputs from solutions.

How much interest is there in this? If there is a lot of interest, how should it be implemented? One approach might be to have a function marxan_output(problem, solution, output_dir), where problem is a ConservationPlanningProblem, solution is the output RasterLayer, Spatial* object, or data.frame object, and output_dir is a character object referring to a folder to save the results in. Thoughts?

rij_matrix contains duplicates for large rasters

The rij_matrix() function returns duplicate rows for features when the features cannot be loaded into memory. This might explain why rij_matrix() is really slow for large data sets. The offending code is

        m <- plyr::llply(seq_len(raster::nlayers(y)), .parallel = FALSE,
          function(i) {
            m <- matrix(y[included], ncol = 1)
            m[is.na(m)] <- 0
            m <- methods::as(m, "dgCMatrix")
          })

which extracts the entire raster data set multiple times. It should instead be more like this:

        m <- plyr::llply(seq_len(raster::nlayers(y)), .parallel = FALSE,
          function(i) {
            m <- matrix(y[[i]][included], ncol = 1)
            m[is.na(m)] <- 0
            m <- methods::as(m, "dgCMatrix")
          })

prioritizrdata on CRAN

The prioritizrdata package will need to be on CRAN before prioritizr can be made available on CRAN.

Logo link broken

Looks like the prioritizr logo file has disappeared. Any chance we could get it back - I'd like to include it in a conference talk I'm giving next week!

Bug in boundary length calculations

There seems to be a bug in generating shared boundary data for planning units when their vertices (or corners) do not align (i.e., are not located in close proximity). Note that this does not affect raster planning unit data: if your planning units are in a grid, this bug does not affect your results. Additionally, since this bug relates to the generation of boundary data, it does not affect prioritisations where the boundary data have been supplied manually by the user.

Given the example below, the boundary_matrix function will fail to detect edges between planning units 1-2 and 2-3. It will correctly identify edges between planning units 1-3.

[example figure: adjacent planning units whose vertices do not align]

I would be interested in ideas to fix this. I suppose one strategy would be to implement a pre-processing step that adds additional vertices to planning unit 2. What does everyone think? Should this bug be fixed prior to the CRAN release?

data.table dependency

Running a line such as
p <- marxan_problem("input_sp.dat")

Returns the error:
Error in load_file("PUNAME", x, input_dir) : the "data.table" package needs to be installed to load Marxan data files.

Could you make data.table a dependency, so it gets installed automatically?

(please ignore if it is already, but install happened to fail in this case)

Logo?

Hi all,

I thought it might be nice to have a logo or picture for the repo organization.

Any thoughts?

Error in defining the problem

Dear all
Finally, I ran the package on the sample data without any problem.
But when I try it on my own data, I receive the error below when defining the problem:
"Error in .doRowMax(t(x), narm = na.rm) :
object '_raster_doRowMax' not found"
I have 11 features and 54000 PUs.

Allow marxan_problem to use the INPUTDIR argument in the Marxan param file

The Marxan parameter file - usually input.dat - has an INPUTDIR argument for the input directory, but I understand that marxan_problem() is constrained to using a directory directly below the location of the Marxan input file.

Could INPUTDIR and OUTPUTDIR be allowed to specify any paths?

An advantage of allowing any directory to be specified is that many problems can then be run from the same input data. For large projects with many problem variants, this can be a major saving of space and complexity.

extract or rasterize?

How should the abundance of each feature in each planning unit be calculated when the features are a RasterStack object and the planning units are a SpatialPolygons object?

At the moment, we're essentially using the raster::extract function. The package has a few tricks to speed this up (like parallel processing), and uses the velox package's extract function if it's installed. However, this can still be very slow for polygons that have complex geometry.

In the marxan package, I did this by rasterizing the planning units and using zonal statistics to calculate abundances. This approach was much faster than raster::extract. I found the best performance using gdalUtils to do the rasterization (though this may no longer be the case). The issue with this approach is that it makes a few assumptions about the data that may not be valid.

Do you think we should stick with the slower extract and tell users that they should rasterize their planning unit data where possible, or use the faster rasterization approach and tell users the limitation of this approach?
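
For concreteness, a minimal sketch of both approaches, assuming `features` is a RasterStack and `pu` is a SpatialPolygonsDataFrame with an integer "id" column (both hypothetical objects):

library(raster)

# approach 1: extract (exact, but can be slow for complex polygon geometry)
abund1 <- raster::extract(features, pu, fun = sum, na.rm = TRUE)

# approach 2: rasterize the planning units and use zonal statistics
# (faster, but assumes each cell belongs to exactly one planning unit)
zones <- raster::rasterize(pu, features[[1]], field = "id")
abund2 <- raster::zonal(features, zones, fun = "sum")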

Optimization software

Hello all,
I tried to get an academic license for Gurobi, but because our institute is a research center and not a university, the company declined my request.
I also tried to use the SYMPHONY solver instead of Gurobi, but it seems very, very slow for me.
Do you have any suggestions for solving this problem? Do you know of any other optimization solver I could use?

Best regards
Iman
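
For what it's worth, the README near the top of this page now recommends the open-source HiGHS solver as a relatively fast and easy-to-install alternative:

# install the open-source HiGHS solver; add_default_solver() will then use
# the best solver installed on the computer
install.packages("highs", repos = "https://cran.rstudio.com/")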

solving with 1 core in Linux

Dear all
On Linux, the problem was created with 100% CPU usage across 8 cores. When the job finished (after 26 h), I received two warnings:
Warning messages:
1: : ... may be used in an incorrect context: ‘.fun(piece, ...)’

2: : ... may be used in an incorrect context: ‘.fun(piece, ...)’

Now I'm trying to solve the created problem, but Rsymphony is using only 1 core. Why did the problem-creation step use 8 cores, while the solving step uses only one?

Support The R Optimization Infrastructure (ROI)

If you support ROI as a solver interface, users can use a number of different solvers out of the box. Currently, plugins for the ILP solvers GLPK, SYMPHONY, lp_solve, and CPLEX are on CRAN.

I am also currently working on CBC as a ROI plugin.

There are also two ROI plugins for Gurobi on Github.

MarZone type problems

As I never work with MarZone-type problems, I hadn't even thought about needing to implement them in prioritizr. @jeffreyhanson or @mstrimas, have you thought about the implementation of this already? If not, we could probably build off code that @mattwatts produced.

Multiple solutions [feature proposal]

I've talked to a few people about the ability to output multiple solutions and it seems like a lot of people want it. Here are some ideas on methods to achieve this:

  1. Implement Benders cuts (as in the raptr R package). Essentially, the problem is solved, then additional constraints are added to forbid that solution, and the problem is solved again, and so on, until the desired number of solutions is obtained. The problem with this approach is that it is very computationally intensive, with no shortcuts. To my knowledge, if the user requires n solutions, the problem must be solved n times. In summary, good points: it can be used to obtain the n solutions that are nearest to optimality and lets us use ROI (#16); bad points: very slow.

  2. Access the solution pool. Gurobi will store feasible solutions it comes across while solving a problem. These can be retrieved after solving the problem with little computational overhead. In summary, good points: quickly get multiple solutions; bad points: quality of solutions is unknown, requires writing C++ code to interface with Gurobi, and prevents us from using ROI to easily extend the range of solvers (#16).

  3. Shuffle problems prior to solving them. It appears that simply re-ordering columns in an LP can cause different solutions to be returned. This is relatively easy to implement by randomly re-ordering columns prior to solving (see the sketch after this issue). In summary, good points: potentially obtain different solutions, all solutions meet the solution gap, plays nicely with ROI (#16); bad points: no guarantee solutions will be different, very slow as all problems must be pre-solved and then solved.

3a. Shuffle problems prior to solving but after pre-solving. Gurobi seems to spend most of its time in the pre-solve step, so avoiding having to repeat this step is most desirable when generating multiple solutions. One potential solution might be to use custom C++ code to (1) presolve the problem, (2) shuffle the columns, (3) solve the shuffled problem, and repeat steps 2-3 n times for n solutions. In summary, good points: all points in 3 but much faster than 3; bad points: idea remains to be tested, will not play nicely with ROI (#16).

Thoughts? Does anyone have any other ideas, or preferences for particular approaches? I suppose with regards to ROI in 2 and 3a, we could just implement a special case for Gurobi but I think the main benefit of using ROI is that we don't have to interface with solvers directly.
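
For context, idea 3 corresponds to what the released package exposes as a shuffle portfolio (the "shuffle portfolio" line appears in the README problem summary near the top of this page). A minimal sketch, assuming the simulated example data; number_solutions and remove_duplicates are arguments of add_shuffle_portfolio():

# generate multiple candidate solutions via the shuffle portfolio
library(prioritizr)
data(sim_pu_raster, sim_features)
p <- problem(sim_pu_raster, sim_features) %>%
  add_min_set_objective() %>%
  add_relative_targets(0.2) %>%
  add_binary_decisions() %>%
  add_shuffle_portfolio(number_solutions = 10, remove_duplicates = TRUE)
s <- solve(p)  # returns one candidate solution per layer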

sf support

Given that sp will soon be superseded by sf (and that sf is much nicer) we will eventually want to consider allowing planning units to be supplied as sf objects.

prioritizr / gurobi uses only one thread

Dear prioritizr-team,
I'm running a Marxan problem using Gurobi on a Linux VM with 32 cores. I've read that Gurobi should use all available resources automatically, but apparently only one thread is used (as printed in the console and also shown in the system monitor, and thus solving the problem takes a very long time). Is there a way I could specify the number of threads? Below is the code I use (and I've sent the data to Jeff in a previous email a while ago), but the problem also persists when using the examples from the manual/vignette.
Thanks a lot in advance, Sami

mp02 <- marxan_problem(x = pu, spec = spec_02, puvspr = puvspr, bound = bound,
                       blm = 0.3, asymmetric_connectivity = TRUE, add_gurobi_solver)
ms_02 <- solve(mp02)

Console output:
Optimize a model with 715517 rows, 365092 columns and 1646043 nonzeros
Variable types: 0 continuous, 365092 integer (365092 binary)
Coefficient statistics:
Matrix range [2e-03, 1e+00]
Objective range [6e-05, 1e+03]
Bounds range [1e+00, 1e+00]
RHS range [5e+01, 2e+03]
.....
Explored 1 nodes (704823 simplex iterations) in 42843.74 seconds
Thread count was 1 (of 32 available processors)
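
A minimal sketch of one likely fix, keeping the arguments from the post above: add the solver to the problem explicitly (rather than passing add_gurobi_solver as a bare argument) and set its threads argument; parallel::detectCores() is base R.

# build the problem, then explicitly add the Gurobi solver with all cores
mp02 <- marxan_problem(x = pu, spec = spec_02, puvspr = puvspr, bound = bound,
                       blm = 0.3, asymmetric_connectivity = TRUE) %>%
  add_gurobi_solver(threads = parallel::detectCores())
ms_02 <- solve(mp02)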

testthat version update

I received an automated email notifying me that a new version of the testthat R package will be made available on CRAN on November 13th. Apparently, the new version will change the behaviour of testthat's functions such that code that currently works will fail upon the release of the new version. As a consequence, the unit tests in prioritizr will need to be rewritten to be compatible with the new version of testthat. Also, we should probably hold off on submitting prioritizr to CRAN until the new version of testthat is on CRAN, to avoid having to do multiple submissions in a short period of time.

Error in (function (..., deparse.level = 1) : node stack overflow

Hi

I'm trying to set up a problem with a very large number of features (4775) in a raster stack. There are about 13400 occupied planning units.

This line:
p <- problem(pu_raster, pu_features)
Returns:
Error in (function (..., deparse.level = 1) : node stack overflow
Error during wrapup: node stack overflow

The same problem has worked fine using marxan_problem().

It's hard to debug without the data. I can zip it and place it online for download if one of the devs wants to have a look.

Or alternatively, do you know the maximum dimensions (e.g., features, PUs, or their product) that the function can handle?

thanks
Dan

Edge factor error

Dear prioritizr users and developers
I want to test different BLM values. When I ran the script below, I received an error: “Error in isTRUE(all(is.finite(edge_factor))) : argument "edge_factor" is missing, with no default”
On the reference webpage (https://prioritizr.github.io/prioritizr/reference/) I couldn’t find how to define the edge factor.
Could you please help me solve this problem?

Best regards
Iman

Script:

# load data
data(sim_pu_raster, sim_features)

# create base problem
p <- problem(sim_pu_raster, sim_features) %>%
  add_min_set_objective() %>%
  add_relative_targets(0.1) %>%
  add_binary_decisions()

# define blms
blms <- c(0.001, 0.1, 1, 10)

# create a list with the solutions
results <- lapply(blms, function(x) {
  p %>% add_boundary_penalties(x) %>% solve()
})
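
Following the README example near the top of this page, the fix is to supply edge_factor explicitly inside the lapply() call:

# supply edge_factor explicitly alongside the penalty value
results <- lapply(blms, function(x) {
  p %>%
    add_boundary_penalties(penalty = x, edge_factor = 0.5) %>%
    solve()
})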
