bmaitner / rbien


Tools for accessing the Botanical Information and Ecology Network (BIEN) database

Home Page: http://bien.nceas.ucsb.edu/bien/

License: Other

R 14.04% HTML 85.77% CSS 0.19%

r biodiversity ecology botanical plant open-science phylogeny traits range-maps bien

rbien's Introduction

RBIEN

Tools for accessing the Botanical Information and Ecology Network (BIEN) database

News:

BIEN is back up on CRAN.

Installing

To install the development version of BIEN from Github:

devtools::install_github("bmaitner/RBIEN")

rbien's Issues

How can we match all the traits for each species at each site?

Hi @bmaitner, how can I match all the traits for each species at each site? I want to explore the scaling relationships among traits using the "BIEN" package. Can you give an example of the code? Thank you.
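One possible way to line up several traits per species and site is to average the long-format trait table and then reshape it to wide format. The sketch below uses base R only and a made-up toy table; the column names are those shown in BIEN trait output elsewhere on this page, but the species names and values are placeholders, and using latitude/longitude as the "site" is an assumption.

```r
# Toy long-format trait table. Values and species names are invented for illustration.
traits <- data.frame(
  scrubbed_species_binomial = c("Species a", "Species a", "Species a", "Species b"),
  trait_name  = c("seed mass", "seed mass", "leaf area", "seed mass"),
  trait_value = c("1.0", "3.0", "10.0", "5.0"),
  latitude    = c(9.15, 9.15, 9.15, 9.15),
  longitude   = c(-79.85, -79.85, -79.85, -79.85),
  stringsAsFactors = FALSE
)

# trait_value arrives as character, so convert before averaging.
traits$trait_value <- suppressWarnings(as.numeric(traits$trait_value))

# Mean value per species x site x trait.
# Note: the formula interface drops rows with NA in any of these columns,
# so records without coordinates are excluded here.
long_means <- aggregate(
  trait_value ~ scrubbed_species_binomial + latitude + longitude + trait_name,
  data = traits, FUN = mean
)

# One row per species x site, one column per trait.
wide <- reshape(
  long_means,
  idvar     = c("scrubbed_species_binomial", "latitude", "longitude"),
  timevar   = "trait_name",
  direction = "wide"
)
print(wide)
```

Each trait ends up in a column named `trait_value.<trait name>`, with `NA` where a species has no record for that trait at that site.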

Return error message if connection is being blocked

It would be useful to have the internal function BIEN_sql return a specific and easily understood error message if the package cannot connect to the database.
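As a rough illustration of the idea, a query call could be wrapped in `tryCatch()` so that a low-level connection failure is reported with a clearer message. `run_with_clear_error()` and `failing_query()` below are hypothetical names invented for this sketch, not functions in the BIEN package, and the firewall wording is only an example of a possible cause.

```r
# Hypothetical helper: run a query function and replace a low-level
# error with a message that points at the likely cause.
run_with_clear_error <- function(query_fun, ...) {
  tryCatch(
    query_fun(...),
    error = function(e) {
      stop(
        "Could not reach the database. The connection may be blocked ",
        "(for example by a firewall). Original error: ",
        conditionMessage(e),
        call. = FALSE
      )
    }
  )
}

# Stand-in for a query that fails to connect.
failing_query <- function() {
  stop("could not connect to server: Connection timed out")
}

# Capture the rewritten message instead of halting the script.
msg <- tryCatch(run_with_clear_error(failing_query), error = conditionMessage)
print(msg)
```

The original error text is kept at the end of the new message so that no diagnostic detail is lost.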

One trait name is capitalized in database

All the trait names in BIEN are in full lower-case. However, one has a capital letter "Leaf lamina fracture toughness". This is strange from a user perspective. And of course querying the non-capitalized version gives nothing back:

# Capitalized trait name
BIEN::BIEN_trait_traitbyspecies(
  "Anacardium excelsum", "Leaf lamina fracture toughness"
)
#>   scrubbed_species_binomial                     trait_name trait_value  unit
#> 1       Anacardium excelsum Leaf lamina fracture toughness        1345 J.m-2
#> 2       Anacardium excelsum Leaf lamina fracture toughness         628 J.m-2
#>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 method
#> 1 Leaf fracture toughness was measured with a pair of scissors (Dovo, Germany) to control and direct crack growth (Lucas & Pereira 1990). The scissors, with an included angle of 55° and a radius of curvature (sharpness) of 1.6 µm, were mounted on a portable universal testing machine (Darvell et al. 1996). We fractured leaves in a transverse cut perpendicular to the midrib. This method, described by Lucas et al. (2001), allows the toughness of individual anatomical features, such as the secondary veins and lamina, to be calculated from a single scissors pass.
#> 2 Leaf fracture toughness was measured with a pair of scissors (Dovo, Germany) to control and direct crack growth (Lucas & Pereira 1990). The scissors, with an included angle of 55° and a radius of curvature (sharpness) of 1.6 µm, were mounted on a portable universal testing machine (Darvell et al. 1996). We fractured leaves in a transverse cut perpendicular to the midrib. This method, described by Lucas et al. (2001), allows the toughness of individual anatomical features, such as the secondary veins and lamina, to be calculated from a single scissors pass.
#>   latitude longitude elevation_m
#> 1     9.15    -79.85          NA
#> 2     9.15    -79.85          NA
#>                                              url_source project_pi
#> 1 http://datadryad.org/resource/doi:10.5061/dryad.69ph0   Kraft TS
#> 2 http://datadryad.org/resource/doi:10.5061/dryad.69ph0   Kraft TS
#>             project_pi_contact access      id
#> 1 [email protected] public 3754751
#> 2 [email protected] public 3754752

# All lower-case trait name
BIEN::BIEN_trait_traitbyspecies(
  "Anacardium excelsum", "leaf lamina fracture toughness"
)
#>  [1] scrubbed_species_binomial trait_name               
#>  [3] trait_value               unit                     
#>  [5] method                    latitude                 
#>  [7] longitude                 elevation_m              
#>  [9] url_source                project_pi               
#> [11] project_pi_contact        access                   
#> [13] id                       
#> <0 rows> (or 0-length row.names)

Created on 2022-07-11 by the reprex package (v2.0.1)
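Until the trait name is harmonized, one user-side workaround could be to look up the stored spelling of a trait name while ignoring case. `match_trait_name()` is a hypothetical helper written for this sketch, and `known_traits` stands in for whatever list of trait names you have retrieved from the database.

```r
# Hypothetical helper: return the stored spelling of a trait name,
# matching case-insensitively. Returns NA when nothing matches.
match_trait_name <- function(query, known_traits) {
  hit <- match(tolower(query), tolower(known_traits))
  known_traits[hit]
}

# Stand-in for the trait names available in the database.
known_traits <- c("seed mass", "Leaf lamina fracture toughness")

stored_name <- match_trait_name("leaf lamina fracture toughness", known_traits)
print(stored_name)
```

The returned spelling can then be passed to the trait functions in place of the user-typed one.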

Create a database listing the fields that are returned (optionally and by default) by a given function.

Or can I at least get a printout of what fields are returned by running a function?

Create a database listing the fields that are returned (optionally and by default) by a given function.

There was a problem with the query...

Hi there! 👋

We're currently on an exciting project where we're attempting to pull some data from BIEN. Specifically, we're looking to run the following command to gather information on the Provence-Alpes-Côte d'Azur region in France:

france_plot <- BIEN_plot_state(country = "France", state = "Provence-Alpes-Côte d'Azur")

However, we've hit a bit of a snag. We encountered this error message:

There was a problem with the query. This is most often due to internet connection issues, but may also be due to other factors such as an outdated version of the package.

Finally, the following call takes a huge amount of time (still running after 1 hour). Is this a server issue?
france_plot <- BIEN_plot_country(country = "France")

We've double-checked our internet connection, which seems strong and stable, and we've also made sure that both the package and R are up to date.

Our goal is to delve deep into the Mediterranean region's biodiversity by extracting data on all species within the area, which means we're looking at a broader scope than just a single region in France.

Could you perhaps shed some light on how we might overcome this error? Any tips or suggestions would be greatly appreciated. We're all ears for any ideas or advice you might have to help us navigate through this!

Thanks a bunch! 🌿

Consistency in species name

I've been trying to download a species list and then get the traits for that list automatically.
However, species names are handled inconsistently between the functions that return the occurrences/ranges of species and the functions that return the trait data.

The former return species names with an underscore, while the latter need species names separated by a space:

library("BIEN")
#> Warning: package 'BIEN' was built under R version 4.0.3
#> Loading required package: RPostgreSQL
#> Warning: package 'RPostgreSQL' was built under R version 4.0.3
#> Loading required package: DBI
#> Warning: package 'DBI' was built under R version 4.0.3
#> Type vignette("BIEN") or vignette("BIEN_tutorial") to get started
#> 

# Ranges work with species names with underscores and without
# But always return species names *with* underscores
BIEN_ranges_load_species("Arnica_ovata")
#> class       : SpatialPolygonsDataFrame 
#> features    : 1 
#> extent      : -173.2246, -104.9879, 30.87895, 67.90291  (xmin, xmax, ymin, ymax)
#> crs         : +proj=longlat +datum=WGS84 +no_defs 
#> variables   : 1
#> names       :      species 
#> value       : Arnica_ovata
BIEN_ranges_load_species("Arnica ovata")
#> class       : SpatialPolygonsDataFrame 
#> features    : 1 
#> extent      : -173.2246, -104.9879, 30.87895, 67.90291  (xmin, xmax, ymin, ymax)
#> crs         : +proj=longlat +datum=WGS84 +no_defs 
#> variables   : 1
#> names       :      species 
#> value       : Arnica_ovata

# Trait functions need species *without* underscores
BIEN_trait_species("Arnica_ovata")
#> data frame with 0 columns and 0 rows
tibble::as_tibble(BIEN_trait_species("Arnica ovata"))
#> # A tibble: 4 x 13
#>   scrubbed_specie~ trait_name trait_value unit  method latitude longitude
#>   <chr>            <chr>      <chr>       <chr> <chr>     <dbl>     <dbl>
#> 1 Arnica ovata     seed mass  1.357       mg    data ~       NA        NA
#> 2 Arnica ovata     whole pla~ Herb        <NA>  Speci~       NA        NA
#> 3 Arnica ovata     seed mass  1.357       mg    data ~       NA        NA
#> 4 Arnica ovata     whole pla~ Herb        <NA>  Speci~       NA        NA
#> # ... with 6 more variables: elevation_m <int>, url_source <chr>,
#> #   project_pi <chr>, project_pi_contact <chr>, access <chr>, id <dbl>

Created on 2020-11-20 by the reprex package (v0.3.0)

Session info

devtools::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value                       
#>  version  R version 4.0.2 (2020-06-22)
#>  os       Windows 10 x64              
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  French_France.1252          
#>  ctype    French_France.1252          
#>  tz       Europe/Berlin               
#>  date     2020-11-20                  
#> 
#> - Packages -------------------------------------------------------------------
#>  package     * version date       lib source        
#>  ape           5.4-1   2020-08-13 [1] CRAN (R 4.0.3)
#>  assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.0.3)
#>  BIEN        * 1.2.4   2020-02-27 [1] CRAN (R 4.0.3)
#>  callr         3.5.1   2020-10-13 [1] CRAN (R 4.0.3)
#>  class         7.3-17  2020-04-26 [2] CRAN (R 4.0.2)
#>  classInt      0.4-3   2020-04-07 [1] CRAN (R 4.0.3)
#>  cli           2.1.0   2020-10-12 [1] CRAN (R 4.0.3)
#>  codetools     0.2-18  2020-11-04 [1] CRAN (R 4.0.3)
#>  crayon        1.3.4   2017-09-16 [1] CRAN (R 4.0.3)
#>  DBI         * 1.1.0   2019-12-15 [1] CRAN (R 4.0.3)
#>  desc          1.2.0   2018-05-01 [1] CRAN (R 4.0.3)
#>  devtools      2.3.2   2020-09-18 [1] CRAN (R 4.0.3)
#>  digest        0.6.27  2020-10-24 [1] CRAN (R 4.0.3)
#>  doParallel    1.0.16  2020-10-16 [1] CRAN (R 4.0.3)
#>  dplyr         1.0.2   2020-08-18 [1] CRAN (R 4.0.3)
#>  e1071         1.7-4   2020-10-14 [1] CRAN (R 4.0.3)
#>  ellipsis      0.3.1   2020-05-15 [1] CRAN (R 4.0.3)
#>  evaluate      0.14    2019-05-28 [1] CRAN (R 4.0.3)
#>  fansi         0.4.1   2020-01-08 [1] CRAN (R 4.0.3)
#>  fasterize     1.0.3   2020-07-27 [1] CRAN (R 4.0.3)
#>  foreach       1.5.1   2020-10-15 [1] CRAN (R 4.0.3)
#>  fs            1.5.0   2020-07-31 [1] CRAN (R 4.0.3)
#>  generics      0.1.0   2020-10-31 [1] CRAN (R 4.0.3)
#>  glue          1.4.2   2020-08-27 [1] CRAN (R 4.0.3)
#>  highr         0.8     2019-03-20 [1] CRAN (R 4.0.3)
#>  htmltools     0.5.0   2020-06-16 [1] CRAN (R 4.0.3)
#>  iterators     1.0.13  2020-10-15 [1] CRAN (R 4.0.3)
#>  KernSmooth    2.23-18 2020-10-29 [1] CRAN (R 4.0.3)
#>  knitr         1.30    2020-09-22 [1] CRAN (R 4.0.3)
#>  lattice       0.20-41 2020-04-02 [2] CRAN (R 4.0.2)
#>  lifecycle     0.2.0   2020-03-06 [1] CRAN (R 4.0.3)
#>  magrittr      2.0.1   2020-11-17 [1] CRAN (R 4.0.3)
#>  memoise       1.1.0   2017-04-21 [1] CRAN (R 4.0.3)
#>  nlme          3.1-150 2020-10-24 [1] CRAN (R 4.0.3)
#>  pillar        1.4.7   2020-11-20 [1] CRAN (R 4.0.2)
#>  pkgbuild      1.1.0   2020-07-13 [1] CRAN (R 4.0.3)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.0.3)
#>  pkgload       1.1.0   2020-05-29 [1] CRAN (R 4.0.3)
#>  prettyunits   1.1.1   2020-01-24 [1] CRAN (R 4.0.3)
#>  processx      3.4.4   2020-09-03 [1] CRAN (R 4.0.3)
#>  ps            1.4.0   2020-10-07 [1] CRAN (R 4.0.3)
#>  purrr         0.3.4   2020-04-17 [1] CRAN (R 4.0.3)
#>  R6            2.5.0   2020-10-28 [1] CRAN (R 4.0.3)
#>  raster        3.3-13  2020-07-17 [1] CRAN (R 4.0.3)
#>  Rcpp          1.0.5   2020-07-06 [1] CRAN (R 4.0.3)
#>  remotes       2.2.0   2020-07-21 [1] CRAN (R 4.0.3)
#>  rgdal         1.5-18  2020-10-13 [1] CRAN (R 4.0.3)
#>  rgeos         0.5-5   2020-09-07 [1] CRAN (R 4.0.3)
#>  rlang         0.4.8   2020-10-08 [1] CRAN (R 4.0.3)
#>  rmarkdown     2.5     2020-10-21 [1] CRAN (R 4.0.3)
#>  RPostgreSQL * 0.6-2   2017-06-24 [1] CRAN (R 4.0.3)
#>  rprojroot     2.0.2   2020-11-15 [1] CRAN (R 4.0.2)
#>  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.0.3)
#>  sf            0.9-6   2020-09-13 [1] CRAN (R 4.0.3)
#>  sp            1.4-4   2020-10-07 [1] CRAN (R 4.0.3)
#>  stringi       1.5.3   2020-09-09 [1] CRAN (R 4.0.3)
#>  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.0.3)
#>  testthat      3.0.0   2020-10-31 [1] CRAN (R 4.0.3)
#>  tibble        3.0.4   2020-10-12 [1] CRAN (R 4.0.3)
#>  tidyselect    1.1.0   2020-05-11 [1] CRAN (R 4.0.3)
#>  units         0.6-7   2020-06-13 [1] CRAN (R 4.0.3)
#>  usethis       1.6.3   2020-09-17 [1] CRAN (R 4.0.3)
#>  utf8          1.1.4   2018-05-24 [1] CRAN (R 4.0.3)
#>  vctrs         0.3.4   2020-08-29 [1] CRAN (R 4.0.3)
#>  withr         2.3.0   2020-09-22 [1] CRAN (R 4.0.3)
#>  xfun          0.19    2020-10-30 [1] CRAN (R 4.0.3)
#>  yaml          2.2.1   2020-02-01 [1] CRAN (R 4.0.3)
#> 
#> [1] C:/Users/ke76dimu/R/win-library/4.0
#> [2] C:/Program Files/R/R-4.0.2/library

Unable to complete queries

When trying to execute queries on BIEN, the following error message is shown:
There was a problem with the query. This is most often due to internet connection issues, but may also be due other factors such as an outdated version of the package.

The error is reproducible in different systems (Windows and Linux), connections (both cable and wireless) and environments (both RStudio and pure R console). Network connection was monitored during the execution and no instabilities or poor connection were detected. The error affects different functions, the following were tested:
BIEN_occurrence_country()
BIEN_occurrence_state()
BIEN_plot_sf()
each with several different parameter combinations.

The relevant versions are:
RBIEN 1.2.6
R 4.3.1
RStudio 2023.06.2+561 "Mountain Hydrangea"

Clarify "ID" field in BIEN_traits_traits output

How narrow is the ID field in the BIEN_traits_traits output? Is this an ID number for a given plant at a given site, or is it an ID number for a given trait of a given plant at a given site? Or an ID for a given project?

I'm looking to match multiple trait measurements for a single plant (for example, specific leaf area and leaf N content) across multiple sites. It would be very helpful indeed if the ID was a unique number for each plant.

This package is amazing and impressively powerful. I wish all trait databases had such great access. Thank you for all your time and energy to put this together!

All the best,
Hilary Rose

Feature Request: A trait table conversion between BIEN traits and TRY trait IDs

Hi @bmaitner 👋

For an ongoing project I'm trying to gather all traits available in both TRY and BIEN for a subset of species.
One thing that I have to work on is to match the trait names in BIEN to trait IDs in TRY.
I actually think that many people are trying to complement both databases (see for example this GitHub search https://github.com/search?q=BIEN+TRY+traits&type=code).
It would be great to have a table that gives you directly the correspondence between BIEN trait names and TRY trait IDs.

Do you have this internally somewhere? Otherwise I'm working on it on my side and may contribute it if you're interested.

Species

Is there a list of species somewhere with the correct way to type them into BIEN? When I get the list of all species most are cut out and the species I type in either aren't in the data set or are typed in incorrectly.

BIEN_trait_mean errors due to conflicting family/genus designations

The function BIEN_trait_mean will sometimes give errors because the same genus is assigned to different families (e.g. "Unknown" and "somethingaceae"). Thankfully the errors don't stop the function, and the NAs that are returned are reasonable (and better than erroneous values).

Revising code to only use families ending in "aceae"
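A minimal sketch of that rule, assuming a genus-to-family lookup table with columns named `genus` and `family` (both the column names and the placeholder taxa are invented here):

```r
# Toy genus-to-family table. "Genusa" is assigned to two families,
# one of which is the non-informative "Unknown".
taxonomy <- data.frame(
  genus  = c("Genusa", "Genusa", "Genusb"),
  family = c("Unknown", "Examplaceae", "Otheraceae"),
  stringsAsFactors = FALSE
)

# Keep only rows whose family name ends in "aceae".
valid <- taxonomy[grepl("aceae$", taxonomy$family), ]
print(valid)
```

Note that this rule would also drop the accepted alternative family names that do not end in "aceae", such as Compositae or Leguminosae, if the database uses them.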

Add links to trait definitions to trait table

Bulk download speed-up

I was trying to download thousands of range maps using BIEN_ranges_species(), and it took over a day to reach 10,000 (without parallelization). Therefore, I had a look at your code.

When dealing with multiple species, you are currently iterating through each one and returning a single ESRI shapefile per species. Instead, I would suggest passing the output of .BIEN_sql() to st_as_sf() from the sf package. This requires a column with WKT data, which .BIEN_sql() returns. Doing so converts the full query result into a single sf object, greatly reducing the overhead.

Using this approach, it took me half a day to download the full set of range maps (> 98,000).
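To illustrate the conversion step only, here is a small sketch with a toy data frame standing in for the query result. The column names `species` and `wkt`, the polygons, and the CRS of EPSG:4326 are assumptions made for this example; the real query output may be shaped differently. It needs the sf package installed.

```r
library(sf)

# Toy stand-in for a query result: one row per species, geometry stored as WKT text.
ranges_df <- data.frame(
  species = c("Species_a", "Species_b"),
  wkt = c(
    "POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))",
    "POLYGON((2 2, 3 2, 3 3, 2 3, 2 2))"
  ),
  stringsAsFactors = FALSE
)

# Convert the whole table to a single sf object in one call,
# parsing the WKT column into the geometry column.
ranges_sf <- st_as_sf(ranges_df, wkt = "wkt", crs = 4326)
print(ranges_sf)
```

Because all rows are parsed in one vectorized call, there is no per-species file handling.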

Species occurrence download from a list does not work

Hi, I am trying to download the BIEN occurrence for a species list, but it does not download the data. Below is an example of what I tried and the error message I got:

Here is the species name list I use to download the occurrences:

spcs <- df %>% pull(scrubbed_species_binomial)

Here I tried to download all the occurrences of those species, restricting the occurrences to the New World:

spocc <- BIEN_occurrence_species(species = spcs, new.world = T)

Here is the error message I got:

There was a problem with the query. This is most often due to internet connection issues, but may also be due other factors such as an outdated version of the package.

No species occurrences are downloaded after the code runs. As my internet connection is quite good, and the BIEN package version I am using is the most recent one (BIEN 1.2.5), I don't know why this is happening.
I have tried to download these occurrences several times over the last 3 to 4 days, but always without success.
Thanks

occurrence records

Is there an argument for this or another function that can get me the collector/observer's name (and collection number) for a range/specific record? Or can I modify the code? Or can I at least get a printout of what fields are returned by running a function?

Error in postgresqlNewConnection

I'm trying to use the R package BIEN to retrieve information of species traits.

However, I'm getting the following error message:

Error in postgresqlNewConnection(drv, ...) :
RS-DBI driver: (could not connect [email protected]:5432 on dbname "public_vegbien": could not connect to server: Connection timed out (0x0000274C/10060)
Is the server running on host "vegbiendev.nceas.ucsb.edu" (128.111.84.31) and accepting
TCP/IP connections on port 5432?

Could you kindly let me know how to solve this problem?

Thank you very much,

Can't download plot data with specific sampling protocol

Hey @bmaitner,
I'm trying to work with BIEN plot data. As mentioned in the tutorial, it is better to work with plots acquired through a similar sampling protocol so that they are comparable. However, when I try to download plots for a specific protocol, I get no results for most of them.
Here's the reprex:

all_protocols = BIEN::BIEN_plot_list_sampling_protocols()

# "1 ha tree plot, stems >= 10 cm dbh"
BIEN::BIEN_plot_sampling_protocol(all_protocols[3, 1])
#> data frame with 0 columns and 0 rows

# "Carolina Vegetation Survey Standard Sampling Method"
BIEN::BIEN_plot_sampling_protocol(all_protocols[4, 1])
#> data frame with 0 columns and 0 rows

# "Center for Tropical Forest Science Forest Inventory  Protocol"
BIEN::BIEN_plot_sampling_protocol(all_protocols[5, 1])
#> data frame with 0 columns and 0 rows

# "TEAM Forest Inventory Sampling Protocol"
BIEN::BIEN_plot_sampling_protocol(all_protocols[7, 1])
#> data frame with 0 columns and 0 rows

# "US Forest Inventory and Analysis Sampling Protocol"
BIEN::BIEN_plot_sampling_protocol(all_protocols[8, 1])
#> data frame with 0 columns and 0 rows

Created on 2020-11-10 by the reprex package (v0.3.0)

Session info

devtools::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value                       
#>  version  R version 4.0.2 (2020-06-22)
#>  os       Windows 10 x64              
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  French_France.1252          
#>  ctype    French_France.1252          
#>  tz       Europe/Berlin               
#>  date     2020-11-10                  
#> 
#> - Packages -------------------------------------------------------------------
#>  package     * version date       lib source        
#>  ape           5.4-1   2020-08-13 [1] CRAN (R 4.0.3)
#>  assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.0.3)
#>  backports     1.2.0   2020-11-02 [1] CRAN (R 4.0.2)
#>  BIEN          1.2.4   2020-02-27 [1] CRAN (R 4.0.3)
#>  callr         3.5.1   2020-10-13 [1] CRAN (R 4.0.3)
#>  class         7.3-17  2020-04-26 [2] CRAN (R 4.0.2)
#>  classInt      0.4-3   2020-04-07 [1] CRAN (R 4.0.3)
#>  cli           2.1.0   2020-10-12 [1] CRAN (R 4.0.3)
#>  codetools     0.2-18  2020-11-04 [1] CRAN (R 4.0.2)
#>  crayon        1.3.4   2017-09-16 [1] CRAN (R 4.0.3)
#>  DBI           1.1.0   2019-12-15 [1] CRAN (R 4.0.3)
#>  desc          1.2.0   2018-05-01 [1] CRAN (R 4.0.3)
#>  devtools      2.3.2   2020-09-18 [1] CRAN (R 4.0.3)
#>  digest        0.6.27  2020-10-24 [1] CRAN (R 4.0.3)
#>  doParallel    1.0.16  2020-10-16 [1] CRAN (R 4.0.3)
#>  dplyr         1.0.2   2020-08-18 [1] CRAN (R 4.0.3)
#>  e1071         1.7-4   2020-10-14 [1] CRAN (R 4.0.3)
#>  ellipsis      0.3.1   2020-05-15 [1] CRAN (R 4.0.3)
#>  evaluate      0.14    2019-05-28 [1] CRAN (R 4.0.3)
#>  fansi         0.4.1   2020-01-08 [1] CRAN (R 4.0.3)
#>  fasterize     1.0.3   2020-07-27 [1] CRAN (R 4.0.3)
#>  foreach       1.5.1   2020-10-15 [1] CRAN (R 4.0.3)
#>  fs            1.5.0   2020-07-31 [1] CRAN (R 4.0.3)
#>  generics      0.1.0   2020-10-31 [1] CRAN (R 4.0.3)
#>  glue          1.4.2   2020-08-27 [1] CRAN (R 4.0.3)
#>  highr         0.8     2019-03-20 [1] CRAN (R 4.0.3)
#>  htmltools     0.5.0   2020-06-16 [1] CRAN (R 4.0.3)
#>  iterators     1.0.13  2020-10-15 [1] CRAN (R 4.0.3)
#>  KernSmooth    2.23-18 2020-10-29 [1] CRAN (R 4.0.3)
#>  knitr         1.30    2020-09-22 [1] CRAN (R 4.0.3)
#>  lattice       0.20-41 2020-04-02 [2] CRAN (R 4.0.2)
#>  lifecycle     0.2.0   2020-03-06 [1] CRAN (R 4.0.3)
#>  magrittr      1.5     2014-11-22 [1] CRAN (R 4.0.3)
#>  memoise       1.1.0   2017-04-21 [1] CRAN (R 4.0.3)
#>  nlme          3.1-150 2020-10-24 [1] CRAN (R 4.0.3)
#>  pillar        1.4.6   2020-07-10 [1] CRAN (R 4.0.3)
#>  pkgbuild      1.1.0   2020-07-13 [1] CRAN (R 4.0.3)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.0.3)
#>  pkgload       1.1.0   2020-05-29 [1] CRAN (R 4.0.3)
#>  prettyunits   1.1.1   2020-01-24 [1] CRAN (R 4.0.3)
#>  processx      3.4.4   2020-09-03 [1] CRAN (R 4.0.3)
#>  ps            1.4.0   2020-10-07 [1] CRAN (R 4.0.3)
#>  purrr         0.3.4   2020-04-17 [1] CRAN (R 4.0.3)
#>  R6            2.5.0   2020-10-28 [1] CRAN (R 4.0.3)
#>  raster        3.3-13  2020-07-17 [1] CRAN (R 4.0.3)
#>  Rcpp          1.0.5   2020-07-06 [1] CRAN (R 4.0.3)
#>  remotes       2.2.0   2020-07-21 [1] CRAN (R 4.0.3)
#>  rgeos         0.5-5   2020-09-07 [1] CRAN (R 4.0.3)
#>  rlang         0.4.8   2020-10-08 [1] CRAN (R 4.0.3)
#>  rmarkdown     2.5     2020-10-21 [1] CRAN (R 4.0.3)
#>  RPostgreSQL   0.6-2   2017-06-24 [1] CRAN (R 4.0.3)
#>  rprojroot     1.3-2   2018-01-03 [1] CRAN (R 4.0.3)
#>  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.0.3)
#>  sf            0.9-6   2020-09-13 [1] CRAN (R 4.0.3)
#>  sp            1.4-4   2020-10-07 [1] CRAN (R 4.0.3)
#>  stringi       1.5.3   2020-09-09 [1] CRAN (R 4.0.3)
#>  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.0.3)
#>  testthat      3.0.0   2020-10-31 [1] CRAN (R 4.0.3)
#>  tibble        3.0.4   2020-10-12 [1] CRAN (R 4.0.3)
#>  tidyselect    1.1.0   2020-05-11 [1] CRAN (R 4.0.3)
#>  units         0.6-7   2020-06-13 [1] CRAN (R 4.0.3)
#>  usethis       1.6.3   2020-09-17 [1] CRAN (R 4.0.3)
#>  vctrs         0.3.4   2020-08-29 [1] CRAN (R 4.0.3)
#>  withr         2.3.0   2020-09-22 [1] CRAN (R 4.0.3)
#>  xfun          0.19    2020-10-30 [1] CRAN (R 4.0.3)
#>  yaml          2.2.1   2020-02-01 [1] CRAN (R 4.0.3)
#> 
#> [1] C:/Users/ke76dimu/Documents/R/win-library/4.0
#> [2] C:/Program Files/R/R-4.0.2/library

vignette("BIEN") responds with vignette 'BIEN' not found

Hi, I'm following the 2017 tutorial that's posted here, but when I reach the command

vignette("BIEN")

It responds with vignette 'BIEN' not found

I'm new so I'm not sure if I did something wrong, can someone help me get this going?

Thanks!

BIEN out of CRAN

Hi @bmaitner and dev-team,

We just realized that BIEN was archived in early January by CRAN. Are there plans to put it back soon?

I hope it is an easy fix. Let me know if I can help with something.

Best,
Gonzalo

Some species names are not capitalized

When retrieving a species list from ranges and matching it against another list, I noticed that some species names are not capitalized.
Here is a reprex:

library("dplyr")
#> Warning: package 'dplyr' was built under R version 4.0.3
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

BIEN::BIEN_ranges_box(
  min.lat = -55.61831, max.lat = 83.64513,
  min.long = -171.79111, max.long = -12.20855,
  species.names.only = TRUE) %>%
  filter(substr(species, 1, 1) %in% letters)
#>                  species
#> 1  chamomilla_chamomilla
#> 2          dubius_dubius
#> 3           dubius_subsp
#> 4       lachenalii_subsp
#> 5    polytaenium_jenmani
#> 6    syngonanthus_nitens
#> 7             x_Catyclia
#> 8          x_Agrohordeum
#> 9            x_Agropogon
#> 10          x_Elyhordeum
#> 11         x_Elysitanion
#> 12         x_Festulolium
#> 13         x_Pseudelymus
#> 14 xPseudelymus_saxicola
#> 15       x_Stiporyzopsis
#> 16  anthurium_geherrerae

Created on 2020-11-24 by the reprex package (v0.3.0)

Session info

devtools::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value                       
#>  version  R version 4.0.2 (2020-06-22)
#>  os       Windows 10 x64              
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  French_France.1252          
#>  ctype    French_France.1252          
#>  tz       Europe/Berlin               
#>  date     2020-11-24                  
#> 
#> - Packages -------------------------------------------------------------------
#>  package     * version date       lib source        
#>  ape           5.4-1   2020-08-13 [1] CRAN (R 4.0.3)
#>  assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.0.3)
#>  BIEN          1.2.4   2020-02-27 [1] CRAN (R 4.0.3)
#>  callr         3.5.1   2020-10-13 [1] CRAN (R 4.0.3)
#>  class         7.3-17  2020-04-26 [2] CRAN (R 4.0.2)
#>  classInt      0.4-3   2020-04-07 [1] CRAN (R 4.0.3)
#>  cli           2.2.0   2020-11-20 [1] CRAN (R 4.0.3)
#>  codetools     0.2-18  2020-11-04 [1] CRAN (R 4.0.3)
#>  crayon        1.3.4   2017-09-16 [1] CRAN (R 4.0.3)
#>  DBI           1.1.0   2019-12-15 [1] CRAN (R 4.0.3)
#>  desc          1.2.0   2018-05-01 [1] CRAN (R 4.0.3)
#>  devtools      2.3.2   2020-09-18 [1] CRAN (R 4.0.3)
#>  digest        0.6.27  2020-10-24 [1] CRAN (R 4.0.3)
#>  doParallel    1.0.16  2020-10-16 [1] CRAN (R 4.0.3)
#>  dplyr       * 1.0.2   2020-08-18 [1] CRAN (R 4.0.3)
#>  e1071         1.7-4   2020-10-14 [1] CRAN (R 4.0.3)
#>  ellipsis      0.3.1   2020-05-15 [1] CRAN (R 4.0.3)
#>  evaluate      0.14    2019-05-28 [1] CRAN (R 4.0.3)
#>  fansi         0.4.1   2020-01-08 [1] CRAN (R 4.0.3)
#>  fasterize     1.0.3   2020-07-27 [1] CRAN (R 4.0.3)
#>  foreach       1.5.1   2020-10-15 [1] CRAN (R 4.0.3)
#>  fs            1.5.0   2020-07-31 [1] CRAN (R 4.0.3)
#>  generics      0.1.0   2020-10-31 [1] CRAN (R 4.0.3)
#>  glue          1.4.2   2020-08-27 [1] CRAN (R 4.0.3)
#>  highr         0.8     2019-03-20 [1] CRAN (R 4.0.3)
#>  htmltools     0.5.0   2020-06-16 [1] CRAN (R 4.0.3)
#>  iterators     1.0.13  2020-10-15 [1] CRAN (R 4.0.3)
#>  KernSmooth    2.23-18 2020-10-29 [1] CRAN (R 4.0.3)
#>  knitr         1.30    2020-09-22 [1] CRAN (R 4.0.3)
#>  lattice       0.20-41 2020-04-02 [2] CRAN (R 4.0.2)
#>  lifecycle     0.2.0   2020-03-06 [1] CRAN (R 4.0.3)
#>  magrittr      2.0.1   2020-11-17 [1] CRAN (R 4.0.3)
#>  memoise       1.1.0   2017-04-21 [1] CRAN (R 4.0.3)
#>  nlme          3.1-150 2020-10-24 [1] CRAN (R 4.0.3)
#>  pillar        1.4.7   2020-11-20 [1] CRAN (R 4.0.2)
#>  pkgbuild      1.1.0   2020-07-13 [1] CRAN (R 4.0.3)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.0.3)
#>  pkgload       1.1.0   2020-05-29 [1] CRAN (R 4.0.3)
#>  prettyunits   1.1.1   2020-01-24 [1] CRAN (R 4.0.3)
#>  processx      3.4.4   2020-09-03 [1] CRAN (R 4.0.3)
#>  ps            1.4.0   2020-10-07 [1] CRAN (R 4.0.3)
#>  purrr         0.3.4   2020-04-17 [1] CRAN (R 4.0.3)
#>  R6            2.5.0   2020-10-28 [1] CRAN (R 4.0.3)
#>  raster        3.4-5   2020-11-14 [1] CRAN (R 4.0.3)
#>  Rcpp          1.0.5   2020-07-06 [1] CRAN (R 4.0.3)
#>  remotes       2.2.0   2020-07-21 [1] CRAN (R 4.0.3)
#>  rgeos         0.5-5   2020-09-07 [1] CRAN (R 4.0.3)
#>  rlang         0.4.8   2020-10-08 [1] CRAN (R 4.0.3)
#>  rmarkdown     2.5     2020-10-21 [1] CRAN (R 4.0.3)
#>  RPostgreSQL   0.6-2   2017-06-24 [1] CRAN (R 4.0.3)
#>  rprojroot     2.0.2   2020-11-15 [1] CRAN (R 4.0.2)
#>  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.0.3)
#>  sf            0.9-6   2020-09-13 [1] CRAN (R 4.0.3)
#>  sp            1.4-4   2020-10-07 [1] CRAN (R 4.0.3)
#>  stringi       1.5.3   2020-09-09 [1] CRAN (R 4.0.3)
#>  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.0.3)
#>  testthat      3.0.0   2020-10-31 [1] CRAN (R 4.0.3)
#>  tibble        3.0.4   2020-10-12 [1] CRAN (R 4.0.3)
#>  tidyselect    1.1.0   2020-05-11 [1] CRAN (R 4.0.3)
#>  units         0.6-7   2020-06-13 [1] CRAN (R 4.0.3)
#>  usethis       1.6.3   2020-09-17 [1] CRAN (R 4.0.3)
#>  vctrs         0.3.4   2020-08-29 [1] CRAN (R 4.0.3)
#>  withr         2.3.0   2020-09-22 [1] CRAN (R 4.0.3)
#>  xfun          0.19    2020-10-30 [1] CRAN (R 4.0.3)
#>  yaml          2.2.1   2020-02-01 [1] CRAN (R 4.0.3)
#> 
#> [1] C:/Users/ke76dimu/R/win-library/4.0
#> [2] C:/Program Files/R/R-4.0.2/library

Of course, for the hybrids (the names starting with "x") it is expected that the first letter is not a capital, but what about the other species?
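For anyone who wants to separate the two cases on their side, a small sketch follows. The hybrid pattern (a leading "x", optionally followed by an underscore, then a capital letter) is an assumption inferred from the names in the output above, not a documented BIEN convention.

```r
# A few names taken from the output above, plus one correctly capitalized name.
species <- c(
  "chamomilla_chamomilla", "x_Catyclia",
  "xPseudelymus_saxicola", "Arnica_ovata"
)

# Names whose first character is lower-case.
starts_lower <- grepl("^[[:lower:]]", species)

# Assumed hybrid pattern: "x" or "x_" directly followed by a capital letter.
looks_hybrid <- grepl("^x_?[[:upper:]]", species)

# Lower-case names that are not explained by the hybrid prefix.
suspect <- species[starts_lower & !looks_hybrid]
print(suspect)
```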

Error in an observation of whole plant vegetative phenology

While playing around with publicly available BIEN traits, I found an error in one observation of the trait "whole plant vegetative phenology". Here is the reprex:

library("dplyr")
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

# Weird value 'evergreenergreen'
BIEN::BIEN_trait_trait("whole plant vegetative phenology") %>%
  count(trait_value)
#>                       trait_value    n
#> 1                       deciduous 1244
#> 2                       evergreen 5419
#> 3                evergreenergreen    1
#> 4 variable or conflicting reports   42

# Precise record with this issue
BIEN:::.BIEN_sql(
  "SELECT scrubbed_species_binomial, trait_name, trait_value, id FROM agg_traits
  WHERE trait_name='whole plant vegetative phenology' AND id='3900233'")
#>   scrubbed_species_binomial                       trait_name      trait_value
#> 1         Pinus douglasiana whole plant vegetative phenology evergreenergreen
#>        id
#> 1 3900233

Created on 2021-11-24 by the reprex package (v2.0.1)

Maybe you can correct this in a future BIEN release ;)

BIEN_ranges_load_species fails with hardcoded proj4string

I'm trying to download a set of range maps for a tutorial in an R package I'm building, and noticed that BIEN_ranges_load_species() now fails persistently, probably due to an update to sp/sf.

xanthium_strumarium <- BIEN_ranges_load_species(species = "Xanthium strumarium")

Error in CRS(p4s): NA

I tracked the issue down to here. It seems that this format for proj4strings is no longer used(?).

I've come up with a solution for this that replaces the hardcoded string with st_crs(4326)[[2]], but I'm not sure how sustainable that is. I'm happy to create a PR that introduces this temporary fix though.
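For reference, a sketch of the workaround using named rather than positional access. It assumes the sf package is installed (the block is skipped otherwise) and that the element extracted by `[[2]]` is the WKT string, which I believe holds for recent sf versions but not necessarily for older ones. It does not show the actual change inside BIEN_ranges_load_species().

```r
# Sketch only: requires the sf package; nothing runs if it is missing.
if (requireNamespace("sf", quietly = TRUE)) {
  crs_wgs84 <- sf::st_crs(4326)  # look up WGS 84 by its EPSG code
  wkt <- crs_wgs84$wkt           # named access to the WKT representation

  # Attaching the CRS by EPSG code avoids any hardcoded proj4string.
  pt <- sf::st_sfc(sf::st_point(c(0, 0)), crs = 4326)
  print(sf::st_crs(pt) == crs_wgs84)
}
```

Named access (`$wkt`) is less fragile than `[[2]]` because it does not depend on the internal ordering of the crs object.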

BIEN_metadata_list_political_names update

The function BIEN_metadata_list_political_names() needs to be updated, as it fails to return countries without 3rd level political divisions. May need to have users specify a political level for which they want standardized political names.

Handling incorrect naming conventions

Would be useful to add automatic correction of common naming issues (most notably replacing underscores with spaces).
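A minimal sketch of what such a correction could look like. `clean_species_names()` is a hypothetical helper, not an existing BIEN function, and it covers only underscores and stray whitespace.

```r
# Hypothetical helper (not part of BIEN): normalise species names before querying.
clean_species_names <- function(x) {
  x <- gsub("_", " ", x, fixed = TRUE)  # replace underscores with spaces
  x <- gsub("\\s+", " ", x)             # collapse repeated whitespace
  trimws(x)                             # drop leading and trailing whitespace
}

cleaned <- clean_species_names(c("Acer_pseudoplatanus", "  Pinus  douglasiana "))
print(cleaned)
```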

markdown formatting error in vignettes

In the BIEN vignette, there should be

a space after the ### to get a heading
a blank line before unordered lists so that they render as lists rather than as text with asterisks

In the BIEN_tutorial vignette

the yaml is malformed - see other vignette for valid yaml
line 39 refers to RBIEN

Can send a pull request if you want

Erroneous DBH values of 5005 or 5000.5

A subset of the DBH values in the trait table are incorrect (DBH records with trait_value = '5005' OR trait_value = '5000.5'). This issue stems from a previous version of the analytical_stem table containing these values for CTFS plots. Presumably this was meant to encode an NA value. The issue has since been corrected in the analytical_stem table but remains in the trait table (which has not been refreshed recently). This issue will be corrected in the next trait table update.
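Until the trait table is refreshed, users can drop these records themselves. The sketch below uses a toy data frame in place of a real query result; the column name `trait_value` follows the trait output shown elsewhere on this page, and the non-sentinel values are invented.

```r
# Toy stand-in for DBH trait records; trait_value is stored as text.
dbh <- data.frame(
  trait_value = c("12.4", "5005", "33.0", "5000.5"),
  stringsAsFactors = FALSE
)

# The two placeholder values named in this issue, compared numerically so
# that formatting differences such as "5005.0" are also caught.
sentinels <- c(5005, 5000.5)
dbh_clean <- dbh[!as.numeric(dbh$trait_value) %in% sentinels, , drop = FALSE]
print(dbh_clean)
```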

BIEN_trait_mean NA checking

When I query some traits for particular species I get the following warning:

1: In mean(as.numeric(species_i_data[[1]][, 1])) :
NAs introduced by coercion
2: In mean(as.numeric(species_i_data[[1]][, 1])) :
NAs introduced by coercion

and I get NA values returned when there should be means available (based on the listed sample size):

first_species.txt

It looks like there is some kind of NA checking based on the source code for BIEN_trait_mean, but changing the 'mean' function calls to include 'na.rm = TRUE' might resolve this.
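The effect is easy to reproduce outside of BIEN. In this sketch the values are made up; the point is only that one non-numeric entry turns the whole mean into NA unless `na.rm = TRUE` is set.

```r
# Made-up trait values, stored as text with one non-numeric entry.
values <- c("1.5", "2.5", "*")

# "*" cannot be parsed and becomes NA (this is the coercion warning).
nums <- suppressWarnings(as.numeric(values))

mean_default <- mean(nums)               # NA: a single NA poisons the mean
mean_na_rm   <- mean(nums, na.rm = TRUE) # 2: mean of the parseable values
print(c(mean_default, mean_na_rm))
```

Note that with `na.rm = TRUE` the reported sample size should count only the values that were actually averaged.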

BIEN_trait_mean performance

I'm trying to pull as many trait means as possible for the following list of species:
names.txt

Using vectorized versions of BIEN_trait_mean(vector_of_species_names, vector_of_traits) usually crashes my R console. I'm not sure if it's on the backend, but returning the list of trait IDs by default could be part of the issue. Maybe we could add a flag to optionally return the list of trait IDs? It greatly increases the size of the data frame that gets returned, and it would be nice if it were optional.

So what I'm doing now is querying means one-by-one:
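traits <- NULL
for (species in species_list) {
  for (trait in trait_list) {
    # query one species/trait pair at a time and append the result
    new_trait <- BIEN_trait_mean(species, trait)
    traits <- rbind(traits, new_trait)
  }
}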
This isn't the 'R' way of doing it but it works quickly - vectorizing a list of 20 species crashes my console.
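The one-by-one approach can also be written without growing a data frame inside a loop. This is a sketch: `fetch_mean()` is a hypothetical stand-in for BIEN_trait_mean() so that the example runs offline, and the species and trait names are placeholders.

```r
# Hypothetical stand-in for BIEN_trait_mean(); returns one row per query.
fetch_mean <- function(species, trait) {
  data.frame(species = species, trait = trait, mean_value = NA_real_,
             stringsAsFactors = FALSE)
}

species_list <- c("Acer rubrum", "Pinus douglasiana")
trait_list <- c("leaf area", "seed mass")

# Every species/trait combination, one row each.
combos <- expand.grid(species = species_list, trait = trait_list,
                      stringsAsFactors = FALSE)

# Query each pair separately, then bind all results in a single step.
results <- lapply(seq_len(nrow(combos)), function(i) {
  fetch_mean(combos$species[i], combos$trait[i])
})
traits <- do.call(rbind, results)
print(traits)
```

Binding once at the end avoids repeatedly copying the accumulated data frame, which matters more as the number of queries grows.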

get the species list (spp) for a given genus

Hello, I am a beginner in R.

I am having trouble getting the list of species (spp) for a particular genus.

spp <- BIEN_occurrence_genus("Aeschynomene", only.new.world = T)

The following error appears:

Error in .BIEN_sql(query, ...) : unused argument (only.new.world = TRUE)

Please, can you help me?
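The error says that the installed version does not accept an argument called `only.new.world`; I believe recent versions name it `new.world`, but that is an assumption, so check ?BIEN_occurrence_genus for your version. Once the occurrence query succeeds, the species list can be extracted as sketched below, here with a toy data frame in place of the real result.

```r
# Real call, not run here (argument name assumed, see ?BIEN_occurrence_genus):
# occ <- BIEN::BIEN_occurrence_genus("Aeschynomene", new.world = TRUE)

# Toy stand-in for the returned occurrence records.
occ <- data.frame(
  scrubbed_species_binomial = c("Aeschynomene americana", "Aeschynomene indica",
                                "Aeschynomene americana", NA),
  stringsAsFactors = FALSE
)

# Unique, non-missing species names, sorted alphabetically.
binomials <- occ$scrubbed_species_binomial
spp <- sort(unique(binomials[!is.na(binomials)]))
print(spp)
```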

Comparison BIEN4/GBIF

Hey @bmaitner!

Following our conversation a couple of weeks ago, I'm finally taking the time to provide a comparison (with an example) between BIEN4 and GBIF data, using the two relevant R packages. I'll use the sycamore maple (Acer pseudoplatanus) for illustration, although the choice of species is probably irrelevant. Here we go:

BIEN4 occurrence data

Note: This comes from my own records from a few days ago, as BIEN servers seem unresponsive as of today (The BIEN servers are currently undergoing updates and may be slower than usual at present.).

Information about BIEN:

library("BIEN")
BIEN_metadata_database_version()

  db_version db_release_date
1      4.2.5      2021-12-07

Get the data:

acps_bien <- BIEN_occurrence_species("Acer pseudoplatanus", 
    native.status = TRUE, 
    political.boundaries = TRUE)
dim(acps_bien)

[1] 1699   22

Only data after 1990:

acps_bien$date_collected <- lubridate::ymd(acps_bien$date_collected)
acps_bien <- subset(acps_bien, date_collected > lubridate::ymd("1990-01-01"))
dim(acps_bien)

[1] 728  22

Convert to sf class for mapping:

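# sf and ggplot2 are needed below; `world` (the base map) and `acps_nom_scient`
# (the title string) are defined elsewhere and are not shown in this issue.
library("sf")
library("ggplot2")
acps_bien <- st_as_sf(acps_bien, coords = c("longitude", "latitude"), remove = FALSE,
    crs = 4326, agr = "constant")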
ggplot(data = world) +
    geom_sf(color = gray(.5), fill= "antiquewhite") +
    geom_sf(data = acps_bien, size = .1, alpha = .2, col = "brown3") +
    coord_sf(xlim = c(2.5e6, 7e6), ylim = c(1.3e6, 5.3e6), crs = st_crs(3035)) +
    labs(
        x = "Longitude",
        y = "Latitude",
        title = acps_nom_scient,
        subtitle = "BIEN data"
    ) +
    theme(
        panel.grid.major = element_line(color = gray(.7),
        linetype = "dashed", size = 0.5),
        panel.background = element_rect(fill = "aliceblue"),
        plot.title = element_text(face = "italic")
    )

GBIF occurrence data and comparison

Prepare the query and download the data:

library("rgbif")
acps_gbif_dl <- occ_download(
    pred("taxonKey", name_backbone(name = "Acer pseudoplatanus", rank = "species")$speciesKey), # Main key
    pred("hasGeospatialIssue", FALSE), # Remove default geospatial issues
    pred("hasCoordinate", TRUE),       # Keep only records with coordinates
    pred("occurrenceStatus","PRESENT"), # Remove absent records
    pred_not(pred_in("basisOfRecord",c("FOSSIL_SPECIMEN","LIVING_SPECIMEN"))), # Remove fossils and living specimens (zoo/botanical garden)
    pred_and( # Between 1990–2020 (both included)
        pred_gte("year", "1990"),
        pred_lte("year", "2020")),
    format = "SIMPLE_CSV"
)
occ_download_wait(acps_gbif_dl)
acps_gbif <- occ_download_get(acps_gbif_dl, path = "Data/gbif-acps/", overwrite = TRUE) |>
    occ_download_import()

Remove non-commercial data and check the resulting data:

acps_gbif <- subset(acps_gbif, license != "CC_BY_NC_4_0")
dim(acps_gbif)

[1] 387557  50

Convert to sf class for mapping:

acps_gbif <- st_as_sf(acps_gbif, coords = c("decimalLongitude", "decimalLatitude"),
    remove = FALSE, crs = 4326, agr = "constant")
ggplot(data = world) +
    geom_sf(color = gray(.5), fill= "antiquewhite") +
    geom_sf(data = acps_gbif, size = .1, alpha = .05, col = "brown3") +
    coord_sf(xlim = c(2.5e6, 7e6), ylim = c(1.3e6, 5.3e6), crs = st_crs(3035)) +
    labs(
        x = "Longitude",
        y = "Latitude",
        title = acps_nom_scient,
        subtitle = "GBIF data"
    ) +
    theme(
        panel.grid.major = element_line(color = gray(.7),
        linetype = "dashed", size = 0.5),
        panel.background = element_rect(fill = "aliceblue"),
        plot.title = element_text(face = "italic")
    )

Summary

There is a striking difference between the two datasets, even after removing the GBIF records with non-commercial restrictions (728 BIEN records vs. 387557 GBIF records).

Trait values of zero

Some trait values are being displayed as 0 where that doesn't make much sense, and are likely erroneous (e.g. leaf area of 0 or seed mass of 0)

Security

Hi @bmaitner,

Nice job on the package; it's helpful for many people. While looking at the code, I noticed that the host, user, and password are stored in it. You should avoid this because it exposes your server. For instance, I was able to connect to other databases (with the psql client). You can mitigate this by adjusting pg_hba.conf. I know that user roles are well set (read-only access to tables), but you might still be exposed to SQL injection attacks, etc.

A safer solution requires a web service: have a look at https://github.com/begriffs/postgrest, which lets you easily deploy a REST API on top of your db. Your RBIEN package would then send requests to the REST API (with the httr R package) instead of querying the db directly (with the RPostgreSQL package). Let me know if you need advice on this.
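To make the suggestion concrete, here is a sketch of how a client could build a request for such an API. The base URL and the helper `build_trait_url()` are hypothetical; the `column=eq.value` filter form follows PostgREST's query syntax, and `agg_traits` and `trait_name` are the table and column names used in the SQL shown elsewhere on this page. No request is sent.

```r
# Hypothetical endpoint; BIEN does not necessarily expose such a service.
base_url <- "https://example.org/api"

# Build a PostgREST-style filter URL. The user-supplied value is
# percent-encoded, so it is never pasted into SQL text by the client.
build_trait_url <- function(base_url, trait_name) {
  paste0(base_url, "/agg_traits?trait_name=eq.",
         utils::URLencode(trait_name, reserved = TRUE))
}

url <- build_trait_url(base_url, "leaf dry mass")
print(url)
```

The resulting URL could then be passed to httr::GET(), with credentials kept on the server side rather than in the package source.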

Cheers,

Add option to return all occurrence records (including those that aren't geovalid)

Add an argument that allows user to return all records for a query, along with the field "is_geovalid".

Strange values for leaf dry mass

When looking at leaf dry mass I again identified weird trait values:

library("dplyr")
BIEN::BIEN_trait_trait("leaf dry mass") %>%
    count(trait_value, sort = TRUE) %>%
    head(5)
#>   trait_value   n
#> 1           * 426
#> 2           0 131
#> 3        0.01  78
#> 4        0.02  70
#> 5        0.04  68

It seems that there are values equal to "*". They all come from this Dryad repository.
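Both kinds of suspicious value seen in the output above ("*" and 0) can be flagged before analysis. This sketch uses a toy data frame holding the five values from the reprex; whether a zero leaf dry mass is really an error is a judgement call, so the rows are flagged rather than silently dropped.

```r
# Toy stand-in holding the five most frequent values from the reprex above.
leaf_mass <- data.frame(
  trait_value = c("*", "0", "0.01", "0.02", "0.04"),
  stringsAsFactors = FALSE
)

# Parse to numeric; anything unparseable (such as "*") becomes NA.
num <- suppressWarnings(as.numeric(leaf_mass$trait_value))

leaf_mass$non_numeric  <- is.na(num)
leaf_mass$non_positive <- !is.na(num) & num <= 0

# Records that pass both checks.
usable <- leaf_mass[!leaf_mass$non_numeric & !leaf_mass$non_positive, ]
print(leaf_mass)
```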

all "BIEN_taxonomy_*()" queries list class as "Equisetopsida"

I was trying to get taxonomic information about a large number (>3,000) of species across the entire phylogeny, and I noticed that no matter which species I entered into BIEN_taxonomy_species(<species>), the resulting dataframe listed the class as Equisetopsida.
All other information appears to be correct.

Here's an example:

> x <- BIEN_taxonomy_species(c("Arabidopsis thaliana", "Ginkgo biloba", "Sequoiadendron giganteum"))
> x$class
[1] "Equisetopsida" "Equisetopsida" "Equisetopsida" "Equisetopsida"

This is a great resource btw. Thanks so much for developing and maintaining it!

BIEN tutorial

Hi, I am doing the BIEN tutorial posted on this repository.
I am confused about this part:

bahamas_country <- BIEN_occurrence_country(country = "Bahamas")
length(unique(bahamas_country$scrubbed_species_binomial))
The result is 954.

But then when I run:
Bahamas_species_list<-BIEN_list_country(country = "Bahamas")
View(Bahamas_species_list)
I get 1100 species.
I am confused because I understood that this function returns the same unique species list as before, but ignoring any NA values, as explained in the tutorial. How come I get more species from the species list function for the same country?
Thanks for any help.
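One way to narrow this down is to look at which species appear in one result but not the other. The sketch below does not explain why the counts differ (the two functions may apply different filters, which is an assumption); it only shows the comparison, using short made-up vectors in place of the two query results.

```r
# Made-up stand-ins for the two results:
# occ_species  ~ unique(bahamas_country$scrubbed_species_binomial)
# list_species ~ the species column of Bahamas_species_list
occ_species  <- c("Pinus caribaea", "Coccoloba uvifera", NA)
list_species <- c("Pinus caribaea", "Coccoloba uvifera", "Swietenia mahagoni")

# Species present in the species list but absent from the occurrence records.
only_in_list <- setdiff(list_species, occ_species)

# Species present in the occurrence records but absent from the species list.
only_in_occ <- setdiff(occ_species[!is.na(occ_species)], list_species)

print(only_in_list)
print(only_in_occ)
```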
