Comments (7)
Cool, one down. Just pushed up a fix for the second error. Go ahead and reinstall and try again. Should be good to go now.
I certainly agree on avoiding SQL; it's far less accessible than R, in my opinion. I've gotten all of our 'out-of-memory' implementations up and running, and am still tracking down any remaining bugs.
In short, when producing estimates for a population that covers multiple states, we can process data on a state-by-state basis and then combine results (which are much smaller than the original database) at the end. The new 'out-of-memory' methods for rFIA take advantage of this: we essentially read the necessary tables for each state into RAM one at a time and summarize to the estimation unit level (estimation units are always sub-state, mutually exclusive populations, hence additive properties apply). We keep the estimation unit level results for each state in RAM, and combine them into the final output once we've iterated over all states. So basically we just chunk up the data, which allows rFIA to produce estimates from larger-than-RAM datasets by reading and processing those chunks one at a time.
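To make the chunking idea concrete, here is a minimal base-R sketch. It is not rFIA's internal code: the data, the column names (state, est_unit, tpa), and the helpers read_state() and summarize_state() are all made up for illustration. It only shows that summing small per-state summaries gives the same total as processing everything at once.

```r
## Toy plot-level records for two states; all names and values are made up
plots <- data.frame(
  state    = c("RI", "RI", "RI", "CT", "CT"),
  est_unit = c("RI-1", "RI-1", "RI-2", "CT-1", "CT-1"),
  tpa      = c(10, 20, 5, 7, 8)
)

## Hypothetical stand-in for reading one state's tables from disk
read_state <- function(state) plots[plots$state == state, ]

## Summarize one chunk to the estimation-unit level
summarize_state <- function(chunk) {
  aggregate(tpa ~ est_unit, data = chunk, FUN = sum)
}

## Process one state at a time, keeping only the small summaries
states <- c("RI", "CT")
unit_totals <- lapply(states, function(s) summarize_state(read_state(s)))

## Combine the per-state summaries once every state has been processed
combined <- do.call(rbind, unit_totals)
chunked_total <- sum(combined$tpa)

## Reference value from processing all records at once
all_at_once <- sum(plots$tpa)
```

Because the estimation units do not overlap, chunked_total and all_at_once agree, while only one state's records need to be held at a time.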
Here is an example if you want to give it a try. Theoretically you should be able to produce estimates from the entire FIA database on a standard desktop (testing now, stay tuned):
## Download data for two small states
getFIA(c('RI', 'CT'), dir = 'path/to/save/', load = FALSE)
## Now set up a Remote.FIA.Database with readFIA
## by setting inMemory = FALSE
## Instead of reading in the data now, we just save a pointer
## and allow the estimator functions to read/process the data
## state-by-state
fia <- readFIA('path/to/save/', inMemory = FALSE)
summary(fia)
## clipFIA methods still work on the remote objects
## The clip will be performed in memory, but on each
## chunk of data read in by an estimator function
fiaMR <- clipFIA(fia)
summary(fiaMR)
## Get some seedling estimates - no change to syntax here
## All years
seedling(fia)
## most recent
seedling(fiaMR)
## plot-level
seedling(fia, byPlot = TRUE)
All estimates produced from the Remote.FIA.Database will be (should be) identical to those produced from the regular in-memory FIA.Database; however, the out-of-memory methods will be slightly slower because the reading happens within the estimator function (e.g., seedling). The drawback is that it is more difficult to modify columns in the tables when using a Remote.FIA.Database. For example, the following would break if fia is a Remote.FIA.Database, but would work if fia were an in-memory FIA.Database:
## Using Remote.FIA.Database from above
## Will break under rFIA v 0.2.4 when fia is a Remote.FIA.Database
fia$COND$STAND_AGE <- makeClasses(fia$COND$STDAGE, interval = 10)
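For anyone unsure what that line is doing, it bins stand age into 10-year classes. Below is a rough base-R sketch of the same kind of binning using cut(). This is not makeClasses() itself, the class labels it produces may differ, and the cond data frame here is a made-up stand-in rather than a real COND table.

```r
## Hypothetical stand ages (years); not real FIA data
cond <- data.frame(STDAGE = c(3, 12, 27, 45, 58))

## Breaks every 10 years, covering the toy range
brks <- seq(0, 60, by = 10)

## right = FALSE gives left-closed classes such as [10,20)
cond$STAND_AGE <- cut(cond$STDAGE, breaks = brks, right = FALSE)
```

This works because cond lives in memory as an ordinary data frame; with a remote database there is no table in memory to assign the new column to.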
from rfia.
Which version of rFIA are you generating the error with? Using versions 0.2.2+ I get the following:
> seedling(fiaRI, bySpecies = TRUE, byPlot = TRUE)
# A tibble: 193 x 10
    PLT_CN  YEAR pltID     SPCD COMMON_NAME      SCIENTIFIC_NAME   PLOT_STATUS_CD   TPA TPA_PERC nStems
     <dbl> <int> <chr>    <int> <chr>            <chr>                      <int> <dbl>    <dbl>  <int>
 1 1.45e14  2009 1_44_7_~   541 white ash        Fraxinus america~              1  75.0      100      1
 2 1.45e14  2009 1_44_7_~   701 eastern hophorn~ Ostrya virginiana              1  75.0      100      1
 3 1.45e14  2009 1_44_7_~   129 eastern white p~ Pinus strobus                  1  75.0      100      1
 4 1.45e14  2009 1_44_7_~   316 red maple        Acer rubrum                    1  75.0      100      1
 5 1.45e14  2009 1_44_9_~   129 eastern white p~ Pinus strobus                  1  75.0      100      1
 6 1.45e14  2009 1_44_9_~   541 white ash        Fraxinus america~              1  75.0      100      1
 7 1.45e14  2009 1_44_9_~   129 eastern white p~ Pinus strobus                  1 8021.      100    107
 8 1.69e14  2010 1_44_3_~    68 eastern redcedar Juniperus virgin~              1  225.      100      3
 9 1.69e14  2010 1_44_3_~   129 eastern white p~ Pinus strobus                  1 1124.      100     15
10 1.69e14  2010 1_44_3_~   129 eastern white p~ Pinus strobus                  1  750.      100     10
# ... with 183 more rows
Could you update rFIA from GitHub with devtools::install_github('hunter-stanke/rFIA') and try it again (this will get you the development version, currently 0.2.3)? If the GitHub install gives you trouble, version 0.2.2 is available on CRAN: install.packages('rFIA')
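If it helps, here is a small sketch for checking which version is installed before and after reinstalling. The helper name needs_update() is made up for this example; requireNamespace(), packageVersion() and package_version() are standard R functions.

```r
## Hypothetical helper: TRUE when pkg is missing or older than `minimum`
needs_update <- function(pkg, minimum) {
  if (!requireNamespace(pkg, quietly = TRUE)) return(TRUE)
  utils::packageVersion(pkg) < package_version(minimum)
}

## Version strings are compared component by component, not as text
older <- package_version("0.2.0") < package_version("0.2.2")

## Check the local install against the CRAN version mentioned above
if (needs_update("rFIA", "0.2.2")) {
  message("rFIA is missing or older than 0.2.2")
}
```

Running packageVersion('rFIA') on its own also prints the installed version, which is handy to include when reporting an error.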
That was it. I was using version 0.2.0. I updated the package to 0.2.2 and it works fine now.
Perfect, glad you're up and running!
I'm encountering a similar(?) problem with the seedling function, using version 0.2.3:
seedling(fiaRI, byPlot = TRUE)
Error in checkForRemoteErrors(val) :
one node produced an error: distinct() must use existing variables.
x TREE not found in .data.
I tried it for a few other states as well with same error.
Great catch, just took care of the bug.
Go ahead and re-install with devtools::install_github('hunter-stanke/rFIA'). You'll be looking for version 0.2.4.
Please let me know if you find anything else. The dev version of rFIA (0.2.4) is a major overhaul from the current CRAN version, intended to provide more memory-efficient implementations of our estimator functions. No breaking changes for users, but a lot of modifications underneath. Still working to identify and fix any lingering bugs before our next CRAN push.
Thanks! That solved the error with byPlot = TRUE. I attempted grouping by SUBP and SPGRPCD and got a different error:
seedling(fiaRI, byPlot = T, bySpecies = T, grpBy = SPGRPCD)
Error in FUN(X[[i]], ...) :
Columns SPGRPCD not found in PLOT, TREE, or COND tables. Did you accidentally quote the variables names? e.g. use grpBy = ECOSUBCD (correct) instead of grpBy = "ECOSUBCD"
(note, it works for ECOSUBCD and a couple other columns I tested from the COND and PLOT tables).
I've been working extensively with rFIA over the past month, and it has improved my understanding of the FIA data much more rapidly than fumbling around with unfamiliar SQL queries. I've been making some attempts to make estimates across species ranges, so I am very interested to find out more about the memory-efficient implementations. I've been intending to reach out and ask about limitations on the size of the FIADB that's read into memory, so theoretically I will get that email out to you soon.