Comments (10)
@sckott's advice is to use ...
crul::set_opts(http_version = 2)
I have also run a GitHub Actions test that seems to confirm that setting curlopts=list(http_version=2) or crul::set_opts(http_version = 2) will fix the issue.
https://github.com/ropensci/rgbif/blob/actions/tests/testthat/test-occ_search.r#L668
https://github.com/ropensci/rgbif/actions/runs/5321113219/jobs/9635790918#step:13:171
After setting curlopts=list(http_version=2), I didn't get the stream error anymore (there were other, unrelated test failures).
https://github.com/ropensci/rgbif/actions/runs/5322173421/jobs/9638248400#step:13:154
Maybe crul::set_opts(http_version = 2) should be the default for the package?
from rgbif.
Hi @johnbaums
I ran the test below on GitHub Actions (Ubuntu, Windows, macOS) and still could not replicate the error.
rgbif/tests/testthat/test-occ_download_wait.R
Lines 41 to 68 in 078cb36
https://github.com/ropensci/rgbif/actions/runs/3740478599/jobs/6348901631
I will leave the issue open in case anyone else runs into the same problem.
from rgbif.
The linked issue says this is due to a buggy server, but we are running the latest supported version of Varnish on api.gbif.org, so I think that's unlikely. We haven't seen the error reported anywhere else.
The best fix would be general workarounds for a failed HTTP request, i.e. retry after a few seconds.
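A minimal sketch of the retry idea, in base R only. The helper name `with_retry` and its arguments are hypothetical and are not part of rgbif or crul; the flaky function below merely simulates the stream error so the example runs without network access.

```r
# Hypothetical helper (not part of rgbif or crul): call `f()` and, if it
# signals an error, wait `wait` seconds and try again, up to `times` attempts.
with_retry <- function(f, times = 3, wait = 2) {
  stopifnot(is.function(f), times >= 1, wait >= 0)
  last_error <- NULL
  for (attempt in seq_len(times)) {
    ok <- FALSE
    result <- tryCatch({
      value <- f()
      ok <- TRUE
      value
    }, error = function(e) e)
    if (ok) {
      return(result)
    }
    last_error <- result
    message(sprintf("attempt %d/%d failed: %s",
                    attempt, times, conditionMessage(last_error)))
    if (attempt < times) Sys.sleep(wait)
  }
  # All attempts failed: re-signal the last error.
  stop(last_error)
}

# Simulated request that fails twice before succeeding.
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("HTTP/2 stream 3 was not closed cleanly")
  "succeeded"
}

status <- with_retry(flaky, times = 5, wait = 0)
```

In practice the wrapped function might be something like `function() occ_download_wait(key)`, where `key` is a download key; whether a blind retry is appropriate for a given request is a judgment call.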
from rgbif.
Thanks @MattBlissett. I'm routinely using curlopts=list(http_version=2) now and haven't had any more problems.
from rgbif.
For what it's worth, I'm also getting the same issue as @johnbaums on macOS with large downloads, e.g.,
occ_download(
  pred('taxonKey', 1),
  pred_in('basisOfRecord',
          c("MACHINE_OBSERVATION", "HUMAN_OBSERVATION")),
  pred_in('country', "AU"),
  pred('hasGeospatialIssue', "FALSE"),
  pred('occurrenceStatus', "PRESENT"),
  pred("hasCoordinate", TRUE),
  pred_lt("coordinateUncertaintyInMeters", 1000),
  pred_gte('year', 2010),
  format = "SIMPLE_CSV"
)
occ_download_wait('0001197-230810091245214')
#> status: running
#> Error in curl::curl_fetch_memory(x$url$url, handle = x$url$handle) :
#> HTTP/2 stream 3 was not closed cleanly before end of the underlying stream
Once the download is complete, though, it works again and reports that the download succeeded.
Session info
# R version 4.3.1 (2023-06-16)
# Platform: aarch64-apple-darwin20 (64-bit)
# Running under: macOS Ventura 13.4.1
#
# Matrix products: default
# BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
# LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
#
# locale:
# [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#
# time zone: Australia/Melbourne
# tzcode source: internal
#
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
#
# other attached packages:
# [1] lubridate_1.9.2 countrycode_1.5.0 terra_1.7-39 readr_2.1.4 rgbif_3.7.7 dplyr_1.1.2
#
# loaded via a namespace (and not attached):
# [1] utf8_1.2.3 generics_0.1.3 xml2_1.3.4 stringi_1.7.12 httpcode_0.3.0 hms_1.1.3 magrittr_2.0.3 timechange_0.2.0 grid_4.3.1
# [10] plyr_1.8.8 jsonlite_1.8.5 whisker_0.4.1 crul_1.4.0 urltools_1.7.3 httr_1.4.6 fansi_1.0.4 scales_1.2.1 oai_0.4.0
# [19] codetools_0.2-19 lazyeval_0.2.2 cli_3.6.1 rlang_1.1.1 crayon_1.5.2 triebeard_0.4.1 bit64_4.0.5 munsell_0.5.0 withr_2.5.0
# [28] tools_4.3.1 parallel_4.3.1 tzdb_0.4.0 colorspace_2.1-0 ggplot2_3.4.2 curl_5.0.1 vctrs_0.6.3 R6_2.5.1 lifecycle_1.0.3
# [37] stringr_1.5.0 bit_4.0.5 vroom_1.6.3 pkgconfig_2.0.3 pillar_1.9.0 gtable_0.3.3 data.table_1.14.8 glue_1.6.2 Rcpp_1.0.10
# [46] tibble_3.2.1 tidyselect_1.2.0 rstudioapi_0.14 compiler_4.3.1
from rgbif.
Hi @johnbaums
I can't replicate the occ_download_wait() issue.
occ_download(
  pred_gt('elevation', 5000),
  pred_in('basisOfRecord', c('HUMAN_OBSERVATION', 'OBSERVATION', 'MACHINE_OBSERVATION')),
  pred('country', 'US'),
  pred('hasCoordinate', TRUE),
  pred('hasGeospatialIssue', FALSE),
  pred_gte('year', 1999),
  pred_lte('year', 2011),
  pred_gte('month', 3),
  pred_lte('month', 8)
)
occ_download_wait('0228133-220831081235567')
# status: running
# status: succeeded
# download is done, status: succeeded
> sessioninfo::session_info()
─ Session info ──────────────────────────────────────
setting value
version R version 4.2.2 (2022-10-31 ucrt)
os Windows 10 x64 (build 19045)
system x86_64, mingw32
ui RStudio
language (EN)
collate Danish_Denmark.utf8
ctype Danish_Denmark.utf8
tz Europe/Paris
date 2022-12-19
rstudio 2022.07.2+576 Spotted Wakerobin (desktop)
pandoc NA
─ Packages ──────────────────────────────────────────
package * version date (UTC) lib source
assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.2.2)
bit 4.0.4 2020-08-04 [1] CRAN (R 4.2.2)
bit64 4.0.5 2020-08-30 [1] CRAN (R 4.2.2)
blob 1.2.3 2022-04-10 [1] CRAN (R 4.2.2)
cachem 1.0.6 2021-08-19 [1] CRAN (R 4.2.2)
callr 3.7.3 2022-11-02 [1] CRAN (R 4.2.2)
cli 3.4.1 2022-09-23 [1] CRAN (R 4.2.2)
colorspace 2.0-3 2022-02-21 [1] CRAN (R 4.2.2)
conditionz 0.1.0 2019-04-24 [1] CRAN (R 4.2.2)
crancache 0.0.0.9001 2022-12-06 [1] Github (r-lib/crancache@7ea4e47)
cranlike 1.0.2 2018-11-26 [1] CRAN (R 4.2.2)
crayon 1.5.2 2022-09-29 [1] CRAN (R 4.2.2)
curl 4.3.3 2022-10-06 [1] CRAN (R 4.2.2)
data.table 1.14.4 2022-10-17 [1] CRAN (R 4.2.2)
DBI 1.1.3 2022-06-18 [1] CRAN (R 4.2.2)
debugme 1.1.0 2017-10-22 [1] CRAN (R 4.2.2)
desc 1.4.2 2022-09-08 [1] CRAN (R 4.2.2)
digest 0.6.30 2022-10-18 [1] CRAN (R 4.2.2)
dplyr * 1.0.10 2022-09-01 [1] CRAN (R 4.2.2)
fansi 1.0.3 2022-03-24 [1] CRAN (R 4.2.2)
fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.2.2)
generics 0.1.3 2022-07-05 [1] CRAN (R 4.2.2)
ggplot2 3.4.0 2022-11-04 [1] CRAN (R 4.2.2)
glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.2)
gtable 0.3.1 2022-09-01 [1] CRAN (R 4.2.2)
httr 1.4.4 2022-08-17 [1] CRAN (R 4.2.2)
jsonlite 1.8.3 2022-10-21 [1] CRAN (R 4.2.2)
lazyeval 0.2.2 2019-03-15 [1] CRAN (R 4.2.2)
lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.2.2)
magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.2)
memoise 2.0.1 2021-11-26 [1] CRAN (R 4.2.2)
munsell 0.5.0 2018-06-12 [1] CRAN (R 4.2.2)
oai 0.4.0 2022-11-10 [1] CRAN (R 4.2.2)
parsedate 1.3.1 2022-10-27 [1] CRAN (R 4.2.2)
pillar 1.8.1 2022-08-19 [1] CRAN (R 4.2.2)
pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.2)
plyr 1.8.8 2022-11-11 [1] CRAN (R 4.2.2)
processx 3.8.0 2022-10-26 [1] CRAN (R 4.2.2)
ps 1.7.2 2022-10-26 [1] CRAN (R 4.2.2)
purrr * 0.3.5 2022-10-06 [1] CRAN (R 4.2.2)
R6 2.5.1 2021-08-19 [1] CRAN (R 4.2.2)
rappdirs 0.3.3 2021-01-31 [1] CRAN (R 4.2.2)
Rcpp 1.0.9 2022-07-08 [1] CRAN (R 4.2.2)
rematch2 2.1.2 2020-05-01 [1] CRAN (R 4.2.2)
remotes 2.4.2 2021-11-30 [1] CRAN (R 4.2.2)
rgbif * 3.7.4 2022-12-06 [1] CRAN (R 4.2.2)
rlang 1.0.6 2022-09-24 [1] CRAN (R 4.2.2)
rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.2.2)
RSQLite 2.2.19 2022-11-24 [1] CRAN (R 4.2.2)
rstudioapi 0.14 2022-08-22 [1] CRAN (R 4.2.2)
scales 1.2.1 2022-08-20 [1] CRAN (R 4.2.2)
sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.2)
stringi 1.7.8 2022-07-11 [1] CRAN (R 4.2.1)
stringr 1.4.1 2022-08-20 [1] CRAN (R 4.2.2)
tibble 3.1.8 2022-07-22 [1] CRAN (R 4.2.2)
tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.2.2)
utf8 1.2.2 2021-07-24 [1] CRAN (R 4.2.2)
uuid 1.1-0 2022-04-19 [1] CRAN (R 4.2.0)
vctrs 0.5.0 2022-10-22 [1] CRAN (R 4.2.2)
whisker 0.4 2019-08-28 [1] CRAN (R 4.2.2)
xml2 1.3.3 2021-11-30 [1] CRAN (R 4.2.2)
[1] C:/Users/ftw712/AppData/Local/Programs/R/R-4.2.2/library
─────────────────────────────────────────────────────
from rgbif.
Thanks for testing, @jhnwllr.
I've now tested a few more machines. It seems to work as expected on Windows, but I consistently run into the "stream x was not closed cleanly" error on macOS.
from rgbif.
Also occurs on Debian, tested with the r-base:4.2.2 Docker image.
from rgbif.
I also ran into the same issue today with occ_download() and occ_download_wait(). I used the suggestion from @johnbaums to work around it.
from rgbif.
This looks to be intermittent and likely related to something outside our control, e.g. dropped packets on long-running connections that prevent streams from reopening gracefully. As a pragmatic short- to mid-term solution, I wonder if we should just default to HTTP/1.1 here by setting curlopts=list(http_version=2) (in libcurl's numbering, 2 selects HTTP/1.1 and 3 selects HTTP/2). It is, after all, just simple polling once every 3 seconds, so I don't think HTTP/2 brings any benefit.
Note the occcite tests have also failed with this.
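Because the value 2 is easy to misread as "HTTP/2", here is a small lookup sketch of what the `http_version` codes mean. The numbers follow libcurl's CURL_HTTP_VERSION_* constants; the helper `describe_http_version` is hypothetical and exists only for illustration.

```r
# Values of libcurl's CURL_HTTP_VERSION_* constants, as accepted by the
# `http_version` curl option.
http_versions <- c(
  "let libcurl choose"   = 0L,
  "HTTP/1.0"             = 1L,
  "HTTP/1.1"             = 2L,
  "HTTP/2"               = 3L,
  "HTTP/2 over TLS only" = 4L
)

# Hypothetical helper: translate a numeric code into a readable label.
describe_http_version <- function(code) {
  idx <- match(code, http_versions)
  if (is.na(idx)) stop("unknown http_version code: ", code)
  names(http_versions)[idx]
}

# The workaround discussed in this thread pins requests to HTTP/1.1.
curlopts <- list(http_version = 2L)
label <- describe_http_version(curlopts$http_version)
print(label)
```

If the curl package is installed, its `curl_symbols()` table should allow checking these values against the local libcurl build.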
from rgbif.