seedling function (rfia issue, closed)

hunter-stanke commented on August 14, 2024
seedling function

from rfia.

Comments (7)

hunter-stanke commented on August 14, 2024

Cool, one down. Just pushed up a fix for the second error. Go ahead and reinstall and try again. Should be good to go now.

I certainly agree on avoiding SQL; it's far less accessible than R in my opinion. I've gotten all of our 'out-of-memory' implementations up and running, and am just still tracking down any remaining bugs.

In short, when producing estimates for a population that covers multiple states, we can process the data state by state and then combine the results (which are much smaller than the original database) at the end. The new 'out-of-memory' methods in rFIA take advantage of this: we read the necessary tables for each state into RAM one at a time and summarize them to the estimation unit level (estimation units are always sub-state, mutually exclusive populations, so their results are additive). We keep the estimation-unit-level results for each state in RAM and combine them into the final output once we've iterated over all states. So basically we chunk up the data, which lets rFIA produce estimates from larger-than-RAM datasets by reading and processing those chunks one at a time.
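To make the chunk-and-combine idea concrete, here is a minimal base-R sketch. It is a toy illustration only, not rFIA's internal code: the file layout, the column names (STATE, ESTN_UNIT, TPA), and the helper summarize_state are all hypothetical, and it uses simple totals as the easiest additive case.

```r
## Toy illustration of chunk-and-combine (NOT rFIA's internal code).
## File layout, column names, and summarize_state() are hypothetical.
set.seed(1)
tmp <- file.path(tempdir(), 'chunk_demo')
dir.create(tmp, showWarnings = FALSE)

## Write one small file per "state", each with two estimation units
for (st in c('RI', 'CT')) {
  plots <- data.frame(
    STATE = st,
    ESTN_UNIT = rep(1:2, each = 5),
    TPA = round(runif(10, 0, 500))
  )
  write.csv(plots, file.path(tmp, paste0(st, '.csv')), row.names = FALSE)
}

## Read one state's data and summarize it to the estimation-unit level
summarize_state <- function(path) {
  dat <- read.csv(path)
  aggregate(TPA ~ STATE + ESTN_UNIT, data = dat, FUN = sum)
}

## Process one file at a time; only the small summaries are kept around
state_files <- list.files(tmp, pattern = '\\.csv$', full.names = TRUE)
unit_totals <- do.call(rbind, lapply(state_files, summarize_state))

## Estimation units are mutually exclusive, so their totals simply add
chunked_total <- sum(unit_totals$TPA)

## For comparison: load everything at once and total it directly
all_at_once <- do.call(rbind, lapply(state_files, read.csv))
in_memory_total <- sum(all_at_once$TPA)
```

Here chunked_total and in_memory_total agree, but the chunked path only ever holds one state's raw rows at a time.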

Here is an example if you want to give it a try. Theoretically you should be able to produce estimates from the entire FIA database on a standard desktop (testing now, stay tuned):

## Download data for two small states
getFIA(c('RI', 'CT'), dir = 'path/to/save/', load = FALSE)

## Now set up a Remote.FIA.Database with readFIA
## by setting inMemory = FALSE
## Instead of reading in the data now, we just save a pointer
## and allow the estimator functions to read/process the data
## state-by-state
fia <- readFIA('path/to/save/', inMemory = FALSE)

summary(fia)

## clipFIA methods still work on the remote objects
## The clip will be performed in memory, but on each 
## chunk of data read in by an estimator function
fiaMR <- clipFIA(fia)
summary(fiaMR)

## Get some seedling estimates - no change to syntax here
## All years
seedling(fia)

## most recent
seedling(fiaMR)

## plot-level
seedling(fia, byPlot = TRUE)

All estimates produced from the Remote.FIA.Database will be (should be) identical to those produced from the regular in-memory FIA.Database. However, the out-of-memory methods will be slightly slower because the reading happens within the estimator function (e.g., seedling). The other drawback is that it is more difficult to modify columns in the tables when using a Remote.FIA.Database. For example, the following would break if fia is a Remote.FIA.Database, but would work if fia were an in-memory FIA.Database:

## Using Remote.FIA.Database from above
## Will break under rFIA v 0.2.4 when fia is a Remote.FIA.Database
fia$COND$STAND_AGE <- makeClasses(fia$COND$STDAGE, interval = 10)

hunter-stanke commented on August 14, 2024

Which version of rFIA are you generating the error with? Using versions 0.2.2+ I get the following:

> seedling(fiaRI, bySpecies = TRUE, byPlot = TRUE)
# A tibble: 193 x 10
    PLT_CN  YEAR pltID     SPCD COMMON_NAME      SCIENTIFIC_NAME   PLOT_STATUS_CD    TPA TPA_PERC nStems
     <dbl> <int> <chr>    <int> <chr>            <chr>                      <int>  <dbl>    <dbl>  <int>
 1 1.45e14  2009 1_44_7_~   541 white ash        Fraxinus america~              1   75.0      100      1
 2 1.45e14  2009 1_44_7_~   701 eastern hophorn~ Ostrya virginiana              1   75.0      100      1
 3 1.45e14  2009 1_44_7_~   129 eastern white p~ Pinus strobus                  1   75.0      100      1
 4 1.45e14  2009 1_44_7_~   316 red maple        Acer rubrum                    1   75.0      100      1
 5 1.45e14  2009 1_44_9_~   129 eastern white p~ Pinus strobus                  1   75.0      100      1
 6 1.45e14  2009 1_44_9_~   541 white ash        Fraxinus america~              1   75.0      100      1
 7 1.45e14  2009 1_44_9_~   129 eastern white p~ Pinus strobus                  1 8021.       100    107
 8 1.69e14  2010 1_44_3_~    68 eastern redcedar Juniperus virgin~              1  225.       100      3
 9 1.69e14  2010 1_44_3_~   129 eastern white p~ Pinus strobus                  1 1124.       100     15
10 1.69e14  2010 1_44_3_~   129 eastern white p~ Pinus strobus                  1  750.       100     10
# ... with 183 more rows

Could you update rFIA from GitHub with devtools::install_github('hunter-stanke/rFIA') and try it again (this will get you the development version, currently 0.2.3)? If the GitHub install gives you trouble, version 0.2.2 is available on CRAN: install.packages('rFIA')
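If it helps, a quick way to check what is installed before retrying is sketched below. has_min_version is a hypothetical helper written for this example, not part of rFIA; it relies only on base R's requireNamespace and utils::packageVersion.

```r
## Hypothetical helper (not part of rFIA): TRUE if `pkg` is installed
## at version `min_version` or newer, FALSE otherwise
has_min_version <- function(pkg, min_version) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(FALSE)
  }
  utils::packageVersion(pkg) >= min_version
}

## Flag a missing or outdated install before re-running the estimator
if (!has_min_version('rFIA', '0.2.2')) {
  message('rFIA is missing or older than 0.2.2; reinstall before retrying')
}
```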

djj4tree commented on August 14, 2024

That was it. I was using version 0.2.0. I updated the package to 0.2.2 and it works fine now.

hunter-stanke commented on August 14, 2024

Perfect, glad you're up and running!

whalend commented on August 14, 2024

I'm encountering a similar(?) problem with the seedling function, using version 0.2.3:

seedling(fiaRI, byPlot = TRUE)

Error in checkForRemoteErrors(val) :
one node produced an error: distinct() must use existing variables.
x TREE not found in .data.

I tried it for a few other states as well, with the same error.

hunter-stanke commented on August 14, 2024

Great catch, just took care of the bug.

Go ahead and re-install with devtools::install_github('hunter-stanke/rFIA'). You'll be looking for version 0.2.4.

Please let me know if you find anything else. The dev version of rFIA (0.2.4) is a major overhaul of the current CRAN version, intended to provide more memory-efficient implementations of our estimator functions. No breaking changes for users, but a lot of modifications underneath. We're still working to identify and fix any lingering bugs before our next CRAN push.

whalend commented on August 14, 2024

Thanks! That solved the error with byPlot = TRUE. I attempted grouping by SUBP and SPGRPCD and got a different error:

seedling(fiaRI, byPlot = T, bySpecies = T, grpBy = SPGRPCD)

Error in FUN(X[[i]], ...) :
Columns SPGRPCD not found in PLOT, TREE, or COND tables. Did you accidentally quote the variables names? e.g. use grpBy = ECOSUBCD (correct) instead of grpBy = "ECOSUBCD"

(note, it works for ECOSUBCD and a couple other columns I tested from the COND and PLOT tables).

I've been working extensively with rFIA over the past month, and it has improved my understanding of the FIA data much more rapidly than fumbling around with unfamiliar SQL queries. I've been making some attempts to make estimates across species ranges, so I am very interested to find out more about memory-efficient implementations. I've been intending to reach out and ask about limitations on the size of the FIADB that's read into memory, so theoretically I will get that email out to you soon.
