usa-npn / rnpn

R client for the National Phenology Network database API

Home Page: https://rdrr.io/cran/rnpn/

License: Other

R 100.00%
web-api species data rstats r national-phenology-network phenology r-package

rnpn's Introduction

rnpn

Project Status: Active – The project has reached a stable, usable state and is being actively developed.

rnpn is an R client for interacting with the USA National Phenology Network data web services. These services include access to a rich set of observer-contributed, point-based phenology records as well as geospatial data products including gridded phenological model and climatological data.

Documentation is available for the National Phenology Network API, which describes the full set of REST services this package wraps.

No API key is needed to retrieve data from the National Phenology Network, but users are required to self-identify, on an honor system, for requests that may draw upon larger datasets. For functions that require it, simply populate the request_source parameter with your name or the name of your institution.

Installation

CRAN version

install.packages("rnpn")

Development version:

install.packages("devtools")
devtools::install_github("usa-npn/rnpn")
library(rnpn)

This package depends on both curl and gdal. On some Linux-based systems, additional system libraries may be needed for those packages, and therefore for this package, to install correctly. For example, on Ubuntu:

sudo apt install libcurl4-openssl-dev
sudo apt install libproj-dev libgdal-dev

The Basics

Many of the functions that search for data require the internal unique identifiers of database entities in order to filter results efficiently. For example, to search by species you must know the species' internal identifier. To get a list of all available species, use the following:

species_list <- npn_species()

Similarly, for phenophases:

phenophases <- npn_phenophases()
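With these lookup tables in hand you can find the identifier you need. A minimal sketch follows; the column names (species_id, common_name) reflect the service's typical output and are worth verifying against your own results:

```r
# Sketch: look up the species_id for common lilac in the species table.
# Column names are an assumption; check names(species_list) first.
species_list <- npn_species()
lilac_id <- species_list[species_list$common_name == "common lilac", "species_id"]
```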

Getting Observational Data

There are four main functions for accessing observational data, at various levels of aggregation. At the most basic level you can download the raw status and intensity data.

some_data <- npn_download_status_data(
  request_source = 'Your Name or Org Here',
  years = c(2015),
  species_id = c(35),
  states = c('AZ', 'IL')
)

Note that through this API, data can only be filtered chronologically by full calendar years. You can specify any number of years in each API call. Also note that request_source is a required parameter and should be populated with your name or the name of the organization you represent. All other parameters are optional but it is highly recommended that you filter your data search further.
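Putting those notes together, a request spanning several full calendar years might look like this (the species and state values are illustrative):

```r
# Illustrative multi-year request: request_source is required,
# everything else narrows the search (recommended).
multi_year <- npn_download_status_data(
  request_source = 'Your Name or Org Here',
  years = c(2015, 2016, 2017),  # full calendar years only
  species_id = c(35),
  states = c('AZ')
)
```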

Getting Geospatial Data

This package wraps around standard WCS endpoints to facilitate the transfer of raster data. Generally, this package does not focus on interacting with WMS services, although they are available. To get a list of all available data layers, use the following:

layers <- npn_get_layer_details()

You can then use a layer's name to select and download geospatial data as a raster.

npn_download_geospatial(
  coverage_id = 'si-x:lilac_leaf_ncep_historic',
  date = '2016-12-31',
  format = 'geotiff',
  output_path = './six-test-raster.tiff'
)
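The resulting GeoTIFF can be read back with any raster-capable package; a minimal sketch using terra (my choice for illustration, rnpn itself does not require it):

```r
# Sketch: load and plot the raster downloaded above.
# Assumes the terra package is installed; any GeoTIFF reader works.
library(terra)
six_raster <- rast('./six-test-raster.tiff')
plot(six_raster)
```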

Example of combined observational and geospatial data

For more details, see Vignette VII.

What’s Next

Please read and review the vignettes for this package to get further information about the full scope of functionality available.

Acknowledgments

This code was developed, in part, as part of the integrated Pheno-Synthesis Software Suite (PS3). The authors acknowledge funding for this work through NASA’s AIST program (80NSSC17K0582, 80NSSC17K0435, 80NSSC17K0538, and 80GSFC18T0003). The University of Arizona and the USA National Phenology Network’s efforts with this package are supported in part by US Geological Survey (G14AC00405, G18AC00135) and the US Fish and Wildlife Service (F16AC01075 and F19AC00168).

Meta

  • Please report any issues or bugs.
  • License: MIT
  • Get citation information for rnpn in R by running citation(package = 'rnpn')
  • Please note that this package is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.


rnpn's People

Contributors

aariq, alyssarosemartin, dlebauer, jeffswitzer, jeroen, maelle, npnlee85, sckott, stevenysw


rnpn's Issues

What's the status of this package?

I saw this package was updated on CRAN and noticed there was no GitHub release, but it actually seems this repo is not the actual source? I'm a bit lost. :-) (not an urgent question)

npn_download_status_data error

I was extracting status and intensity data via the following script
eastUS0921V3_csv <- npn_download_status_data(
  request_source = "ERENbiodiversity",
  years = c(2009:2021),
  coords = c(
    lower_left_lat = 24.10487, lower_left_long = -95.1356,
    upper_right_lat = 48.73317, upper_right_long = -61.56138
  ),
  email = "[email protected]",
  download_path = "eastUS0921_V3.csv"
)

The data extraction proceeded fine until the following message appeared in my console:

Found 2110000 records...closing curl input connection.
Service is currently unavailable. Please try again later!
Warning message:
In readLines(con, n = pagesize, encoding = "UTF-8") :
incomplete final line found on 'https://www.usanpn.org/npn_portal//observations/getObservations.ndjson?'

I know that the last two lines are warnings, but I'm not sure what they mean.

And I am not sure what the 'service unavailable...' message means. I do have a CSV file saved on my local drive. Still, I would like to know what these error messages are.

Error in getobsspbyday

When I run the example from the help document for getobsspbyday I get an error:

library(devtools) 
install_github("rnpn", "ropensci")
library(rnpn)

# Lookup names
temp <- lookup_names(name = "bird", type = "common")
comnames <- temp[temp$species_id %in% c(357, 359, 1108), "common_name"]

# Get some data
out <- getobsspbyday(speciesid = c(357, 359, 1108), startdate = "2010-04-01", enddate = "2013-09-31")

Error:

Error in list_to_dataframe(res, attr(.data, "split_labels")) : 
  Results must be all atomic, or all data frames

Here's my sessionInfo()

R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rnpn_0.0.5        rjson_0.2.13      devtools_1.4.1    ggplot2_0.9.3.1  
[5] dplyr_0.1         data.table_1.8.10

loaded via a namespace (and not attached):
 [1] assertthat_0.1     colorspace_1.2-4   dichromat_2.0-0   
 [4] digest_0.6.4       evaluate_0.5.1     grid_3.0.2        
 [7] gtable_0.1.2       httr_0.2           labeling_0.2      
[10] MASS_7.3-29        memoise_0.1        munsell_0.4.2     
[13] parallel_3.0.2     plyr_1.8           proto_0.3-10      
[16] RColorBrewer_1.0-5 Rcpp_0.10.6        RCurl_1.95-4.1    
[19] reshape2_1.2.2     scales_0.2.3       stringr_0.6.2     
[22] tools_3.0.2        whisker_0.3-2   

Release rnpn 1.2.8

Prepare for release:

  • git pull
  • Check current CRAN check results
  • Polish NEWS
  • usethis::use_github_links()
  • urlchecker::url_check()
  • devtools::build_readme()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • revdepcheck::revdep_check(num_workers = 4)
  • Update cran-comments.md
  • git push

Submit to CRAN:

  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version(push = TRUE)

Error installing on windows machine

Rtools is present

Error in setClass("npn", slots = list(taxa = "data.frame", stations = "data.frame",  : 
  unused argument(s) (slots = list(taxa = "data.frame", stations = "data.frame", phenophase = "data.frame", data = "data.frame"))
Error : unable to load R code in package 'rnpn'
ERROR: lazy loading failed for package 'rnpn'

Service down?

Not sure where to report this, but when I tried using rnpn, it seems the service is down. For example:

> library(rnpn)
> npn_species()
Service is unavailable. Try again later!
# A tibble: 1 × 1
  nodata
  <chr>
1 servicedown

Push first version to CRAN

Need to get this up on CRAN so that spocc can depend on it.

Things to do:

  • Make sure documentation is filled out
  • Make sure tests written
  • other...

Please remove dependencies on **rgdal**, **rgeos**, and/or **maptools**

This package depends on (depends, imports or suggests) raster and one or more of the retiring packages rgdal, rgeos or maptools (https://r-spatial.org/r/2022/04/12/evolution.html). Since raster 3.6.3, all use of external FOSS library functionality has been transferred to terra, making the retiring packages very likely redundant. It would help greatly if you could remove dependencies on the retiring packages as soon as possible.

Add use case examples / unexpected error

This whole package looks pretty amazing.

It seems like it wouldn't be too difficult to replicate some figures from the literature in order to illustrate this package use. For instance, I thought this one might make a splashy example:

http://www.nature.com/nature/journal/v394/n6696/abs/394839b0.html

After looking up the species id for the red lilac successfully, I tried:

out <- getallobssp(speciesid = 35, startdate="1960-01-01", enddate="1994-01-01")

and got this error

Error in data.frame(phenophase_id = "76", phenophase_name = "First leaf",  : 
  arguments imply differing number of rows: 1, 0

(Provided this works, it would be fun to update the original figure with data since 1994 as well...)

Some of Lizzie's papers might be nice potential examples too: https://www.usanpn.org/biblio?f[author]=4604

Error in getallobssp function

Error found by @cboettig

out <- getallobssp(speciesid = 35, startdate="1960-01-01", enddate="1994-01-01")

and got this error

Error in data.frame(phenophase_id = "76", phenophase_name = "First leaf",  : 
  arguments imply differing number of rows: 1, 0

deprecation of observations by day functionality

Any plans to re-introduce something like npn_obsspbyday()?

The dev version reports it's defunct, while the CRAN version yields the following error:

Error in npn_GET(paste0(base(), "observations/getObservationsForSpeciesByDay.json"),  : 
  Not Found (HTTP 404).

Error in npn_allobssp() with rbindlist

Just trying to capture NPN observations for Monarch butterfly (species id 396), but there is an internal error with rbindlist().

Depending on the dates I specify, the details of the column-count mismatch reported by rbindlist vary...

out = npn_allobssp(396, startdate='2013-01-01', enddate = '2015-03-15')
Error in rbindlist(lapply(tt$station_list, data.frame)) :
Item 2 has 7 columns, inconsistent with item 1 which has 16 columns. If instead you need to fill missing columns, use set argument 'fill' to TRUE.

out = npn_allobssp(396, startdate='2015-01-01', enddate = '2015-12-31')
Error in rbindlist(lapply(tt$station_list, data.frame)) :
Item 2 has 14 columns, inconsistent with item 1 which has 16 columns. If instead you need to fill missing columns, use set argument 'fill' to TRUE.

out = npn_allobssp(396, startdate='2013-01-01', enddate = '2013-12-31')
Error in rbindlist(lapply(tt$station_list, data.frame)) :
Item 2 has 16 columns, inconsistent with item 1 which has 7 columns. If instead you need to fill missing columns, use set argument 'fill' to TRUE.

R stuck in reading npn intensity and status data

I downloaded status and intensity data for 2009-2021 period from NPN web portal and I am having trouble reading it into R. I first tried:

library(readr)

eastUS <- read_csv(file = "webPortalData/status_intensity_observation_data.csv")

The script runs and gets stuck for hours... R does not become non-responsive but makes no progress even after

eta: 0s
in the progress bar

I also tried several other reading functions

library(vroom)
eastUS <- vroom(file = "webPortalData/status_intensity_observation_data.csv")
Still, no success in reading it.

The progress bar is stuck at

indexing status_intensity_observation_data.csv [==================================================================---] 208.91MB/s, eta: 1s

vroom is better at reading large datasets than read_csv, and both functions work fine for my other CSV files; it is just the portal-downloaded file that gives me this issue. I am trying to compare whether I get the same data from the web portal download and the rnpn package pull, which is why I need this roundabout way of getting the status and intensity data.

Do you recommend a better package for reading the web portal data into R? I am not sure if this issue is due to the large file size/memory, or an issue with parsing the CSV (some odd field-separation issue).
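One option worth trying, offered here as a suggestion rather than anything from the package documentation, is data.table::fread, which is often fast on very large CSVs and reports its parsing decisions when asked:

```r
# Suggestion (untested against this particular file): fread handles
# large CSVs well; verbose = TRUE prints details about how the file
# is being parsed, which can reveal field-separation problems.
library(data.table)
eastUS <- fread("webPortalData/status_intensity_observation_data.csv",
                verbose = TRUE)
```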
