Comments (10)

medewitt commented on September 26, 2024

Tossing this out there. With R >= 4 (I know, not again), you can access the user cache directory in a CRAN-compliant fashion. In the old days you could use the rappdirs package to do this (as is the approach in tigris, where you can set an option, e.g. options(tigris_use_cache = TRUE), and it will cache the appropriate data). I used a similar approach in a toy package here when I was trying to figure out how to make Stan code easier to move around.
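A minimal sketch of the two lookups mentioned above. It assumes the R >= 4 mechanism being referred to is `tools::R_user_dir()` (the comment does not name it), and the rappdirs branch only runs if that package happens to be installed. Neither call creates the directory; it only reports where the cache would live.

```r
# Per-user cache directory using base R only (requires R >= 4.0.0).
# Assumption: this is the mechanism the comment alludes to.
cache_base <- tools::R_user_dir("epinowcast", which = "cache")

# The rappdirs equivalent, evaluated only when rappdirs is available.
cache_rappdirs <- if (requireNamespace("rappdirs", quietly = TRUE)) {
  rappdirs::user_cache_dir(appname = "epinowcast")
} else {
  NA_character_
}

# Both are plain paths; nothing has been written to disk at this point.
print(cache_base)
print(cache_rappdirs)
```

The two paths will generally differ (the base R one nests the package name under an `R` directory), so a package would want to pick one and stick with it.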

I'd have to do more research on instantiate to say anything intelligent on it.

from epinowcast.

seabbs commented on September 26, 2024

Ah okay. I think I had not really understood this when you showed it to me before. Could you scaffold out some quick toy code here to show how this alternative would work, and give the upsides/downsides you see?

medewitt commented on September 26, 2024

Going the rappdirs route (and adding it as a dependency), the code could look like the following:

# Allow users to set the cache location
options(enw_cache=TRUE)

enw_model <- function(model = system.file(
                        "stan", "epinowcast.stan",
                        package = "epinowcast"
                      ),
                      include = system.file("stan", package = "epinowcast"),
                      compile = TRUE, threads = TRUE, profile = FALSE,
                      target_dir = getOption("enw_cache"), stanc_options = list(),
                      cpp_options = list(), verbose = TRUE, ...) {

    if (isTRUE(target_dir)) {
        # TRUE: use the per-user cache directory
        target_dir <- rappdirs::user_cache_dir(appname = "epinowcast")
        cli::cli_alert("Stan models cached at: `{target_dir}`.")
    } else if (is.null(target_dir) || is.logical(target_dir)) {
        # FALSE, NA, or option never set: fall back to the session tempdir
        target_dir <- tempdir()
    } else if (!dir.exists(target_dir)) {
        # A path was supplied but does not exist yet
        cli::cli_alert("{target_dir} does not exist; creating it.")
        # Maybe a tryCatch here
        dir.create(target_dir)
    }
    # Otherwise target_dir is an existing directory and is used as given

    return(target_dir)

}

# Then a helper function
.onAttach <- function(libname, pkgname) {
  packageStartupMessage("To enable Stan model caching, use `options(enw_cache = TRUE)` in an R script or .Rprofile.")
}

# Run with caching
enw_model()
#> → Stan models cached at: `~/Library/Caches/epinowcast`.
#> [1] "~/Library/Caches/epinowcast"

# Turn it off (also works if the option is never set; test not shown)
options(enw_cache = FALSE)
enw_model()
#> [1] "/var/folders/0x/6bnjy4n15kz8nbfbbwbyk3pr0000gn/T//RtmpjVfdmO"

options(enw_cache = "scratch")
enw_model()
#> [1] "scratch"

The only drawback is that CmdStan will recompile the code when the "modified date" properties on the files change. This means that on each call to write_stan_files_no_profile these properties will change, triggering a recompile. Will think on this more, but at least that would be the sketch of caching using this approach.

seabbs commented on September 26, 2024

Okay this is very very cool and seems like the best of both worlds to me?

> will recompile the code when the "modified date"

And we can't get around this by just saving the no profile model (or the profile one) to a different temporary name? I'm not actually sure how that would work though having written it.

medewitt commented on September 26, 2024

> And we can't get around this by just saving the no profile model (or the profile one) to a different temporary name? I'm not actually sure how that would work though having written it.

I just dug a little further, and yes, if we have the files written out somewhere initially, we can copy them into the user cache while preserving the date, so that the model only needs to be compiled once (or on an actual change). If the user updates the package it will/should wipe the cache.

I repeated the following code and it only needed to be compiled once as a POC.

file.copy("bleh.stan", "scratch/bleh.stan", overwrite = TRUE, copy.date = TRUE)
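To see what `copy.date = TRUE` changes, here is a small self-contained check. It is only a sketch: the file names and the one-hour backdating are made up for illustration, and it inspects timestamps rather than running CmdStan, so it shows the precondition for skipping recompilation rather than the skip itself.

```r
# Work in a throwaway directory so nothing outside tempdir() is touched.
work <- file.path(tempdir(), "copy-date-demo")
dir.create(work, showWarnings = FALSE)
src <- file.path(work, "bleh.stan")
writeLines("parameters { real y; } model { y ~ std_normal(); }", src)

# Backdate the source so a preserved timestamp is distinguishable from "now".
Sys.setFileTime(src, Sys.time() - 3600)

kept <- file.path(work, "kept.stan")
fresh <- file.path(work, "fresh.stan")
file.copy(src, kept, overwrite = TRUE, copy.date = TRUE)  # keeps the old mtime
file.copy(src, fresh, overwrite = TRUE)                   # gets a new mtime

# Absolute difference from the source's modification time, in seconds.
mtime_gap <- function(path) {
  abs(as.numeric(difftime(file.mtime(path), file.mtime(src), units = "secs")))
}
kept_diff <- mtime_gap(kept)
fresh_diff <- mtime_gap(fresh)
```

Here `kept_diff` should be close to zero while `fresh_diff` should be close to an hour, which is the difference a timestamp-based "is the executable up to date?" check would react to.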

We might have a path forward....

seabbs commented on September 26, 2024

  1. Americans get up too early
  2. Amazing!

> If the user updates the package it will/should wipe the cache.

Agree. This would mean it would be a pain to work with multiple versions of epinowcast on a system, but I don't really think that edge case is going to be a real issue for most users, and it can be pinned until/if it's flagged.

seabbs commented on September 26, 2024

It feels a bit tricky having the argument be logical or a character string?

Maybe we instead want to change to an entirely environment variable approach, as tigris does, and tell the user on attach what to run to get this to use the user cache?

https://github.com/walkerke/tigris/blob/0cc826af79cc4afb318a87330979b9890e8a7b0f/R/helpers.R#L20

So something like:

enw_set_cache(rappdirs::user_cache_dir(appname = "epinowcast"))

enw_model()

and we would want to pair it with:

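enw_get_cache <- function() {
  # Read the cache location from an environment variable.
  # The variable name "enw_cache_location" is an assumption for illustration.
  cache <- Sys.getenv("enw_cache_location", unset = "")
  if (!nzchar(cache)) {
    # Variable not set: fall back to the session temporary directory
    cache <- tempdir()
  }
  return(cache)
}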

to be used in enw_model and maybe by the user?

medewitt commented on September 26, 2024

@seabbs I started a branch to test this approach and self-assigned the issue, if that's OK with you.

seabbs commented on September 26, 2024

That is very okay with me. Excited to see this.

medewitt commented on September 26, 2024

It seems to work. The user can now:

  • set the cache using the enw_set_cache function. If nothing is supplied it will default to the tempdir.
  • unset the cache using enw_unset_cache. This removes the location from the environment, but not from disk, so if they reset the cache to a location that once was a cache, it will still exist.
  • inspect the cache location using enw_get_cache, a generally internal but user-exposed function. The location is written to the .Renviron file; the user could also use the Sys.setenv function if desired (i.e., they don't want to add anything to their dotfiles).
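The set/unset/get behaviour described in the list above can be sketched at session level as follows. This is not the code on the branch: the `sketch_*` function names and the `enw_cache_location` variable name are hypothetical, and unlike the branch this version only touches the current session's environment (via `Sys.setenv`) and never writes to .Renviron.

```r
# Hypothetical helpers; the real functions on the branch may differ.
sketch_set_cache <- function(path = tempdir()) {
  # Create the directory if needed, then record it for this session only.
  if (!dir.exists(path)) dir.create(path, recursive = TRUE)
  Sys.setenv(enw_cache_location = path)
  invisible(path)
}

sketch_unset_cache <- function() {
  # Removes the variable only; files already in the directory stay on disk.
  Sys.unsetenv("enw_cache_location")
  invisible(NULL)
}

sketch_get_cache <- function() {
  # Fall back to the session temporary directory when nothing is set.
  cache <- Sys.getenv("enw_cache_location", unset = "")
  if (!nzchar(cache)) tempdir() else cache
}

demo_dir <- file.path(tempdir(), "enw-cache-demo")
sketch_set_cache(demo_dir)
set_value <- sketch_get_cache()    # the directory that was just set
sketch_unset_cache()
unset_value <- sketch_get_cache()  # back to tempdir()
```

Because unsetting leaves the directory in place, setting the cache to the same path later would find any previously compiled models still there, which matches the behaviour described for enw_unset_cache.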

I ran the branch in two different sessions with very good results. The compile time was about one minute in the first session, but when recalling the cached models in the second session I didn't have to wait for compilation.

It turns out cmdstanr had some pretty sophisticated file writing built in, so I just leveraged that functionality.

Initial R Session
#' Load the package contents
devtools::load_all()
## ℹ Loading epinowcast
Sys.time()
## [1] "2023-12-14 15:26:07 EST"
#' Inspect the cache
enw_get_cache()
## Using `/Users/michael/Library/Caches/epinowcastw` for the cache location.

## [1] "/Users/michael/Library/Caches/epinowcastw"
enw_unset_cache()
enw_get_cache()
## [1] "/var/folders/0x/6bnjy4n15kz8nbfbbwbyk3pr0000gn/T//Rtmpgw6v1y"
#' Set the cache and confirm
enw_set_cache(rappdirs::user_cache_dir("epinowcast"))
enw_get_cache()
## Using `/Users/michael/Library/Caches/epinowcast` for the cache location.

## [1] "/Users/michael/Library/Caches/epinowcast"
compile1 <- system.time(enw_model())
## Using model /Users/michael/wfu-id/epinowcast/inst/stan/epinowcast.stan.

## include is /Users/michael/wfu-id/epinowcast/inst/stan.

## Using `/Users/michael/Library/Caches/epinowcast` for the cache location.
## Compiling Stan program...

## Warning in readLines(hpp_path): incomplete final line found on
## '/var/folders/0x/6bnjy4n15kz8nbfbbwbyk3pr0000gn/T//Rtmpgw6v1y/model-9abc52b29753.hpp'

## In file included from /var/folders/0x/6bnjy4n15kz8nbfbbwbyk3pr0000gn/T/Rtmpgw6v1y/model-9abc52b29753.hpp:1:
## In file included from /Users/michael/.cmdstan/cmdstan-2.33.1/stan/src/stan/model/model_header.hpp:4:
## In file included from /Users/michael/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/stan/math.hpp:19:
## In file included from /Users/michael/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/stan/math/rev.hpp:10:
## In file included from /Users/michael/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/stan/math/rev/fun.hpp:200:
## In file included from /Users/michael/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/stan/math/prim/functor.hpp:16:
## In file included from /Users/michael/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/stan/math/prim/functor/integrate_ode_rk45.hpp:6:
## In file included from /Users/michael/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/stan/math/prim/functor/ode_rk45.hpp:9:
## In file included from /Users/michael/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/lib/boost_1.78.0/boost/numeric/odeint.hpp:76:
## In file included from /Users/michael/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/lib/boost_1.78.0/boost/numeric/odeint/integrate/observer_collection.hpp:23:
## In file included from /Users/michael/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/lib/boost_1.78.0/boost/function.hpp:30:
## In file included from /Users/michael/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/lib/boost_1.78.0/boost/function/detail/prologue.hpp:17:
## In file included from /Users/michael/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/lib/boost_1.78.0/boost/function/function_base.hpp:21:
## In file included from /Users/michael/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/lib/boost_1.78.0/boost/type_index.hpp:29:
## In file included from /Users/michael/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/lib/boost_1.78.0/boost/type_index/stl_type_index.hpp:47:
## /Users/michael/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/lib/boost_1.78.0/boost/container_hash/hash.hpp:132:33: warning: 'unary_function<const std::error_category *, unsigned long>' is deprecated [-Wdeprecated-declarations]
##   132 |         struct hash_base : std::unary_function<T, std::size_t> {};
##       |                                 ^
## /Users/michael/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/lib/boost_1.78.0/boost/container_hash/hash.hpp:692:18: note: in instantiation of template class 'boost::hash_detail::hash_base<const std::error_category *>' requested here
##   692 |         : public boost::hash_detail::hash_base<T*>
##       |                  ^
## /Users/michael/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/lib/boost_1.78.0/boost/container_hash/hash.hpp:420:24: note: in instantiation of template class 'boost::hash<const std::error_category *>' requested here
##   420 |         boost::hash<T> hasher;
##       |                        ^
## /Users/michael/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/lib/boost_1.78.0/boost/container_hash/hash.hpp:551:9: note: in instantiation of function template specialization 'boost::hash_combine<const std::error_category *>' requested here
##   551 |         hash_combine(seed, &v.category());
##       |         ^

## /usr/local/Cellar/llvm/17.0.4/bin/../include/c++/v1/__functional/unary_function.h:23:29: note: 'unary_function<const std::error_category *, unsigned long>' has been explicitly marked deprecated here
##    23 | struct _LIBCPP_TEMPLATE_VIS _LIBCPP_DEPRECATED_IN_CXX11 unary_function
##       |                             ^

## /usr/local/Cellar/llvm/17.0.4/bin/../include/c++/v1/__config:971:41: note: expanded from macro '_LIBCPP_DEPRECATED_IN_CXX11'
##   971 | #    define _LIBCPP_DEPRECATED_IN_CXX11 _LIBCPP_DEPRECATED
##       |                                         ^
## /usr/local/Cellar/llvm/17.0.4/bin/../include/c++/v1/__config:956:49: note: expanded from macro '_LIBCPP_DEPRECATED'
##   956 | #      define _LIBCPP_DEPRECATED __attribute__((__deprecated__))
##       |                                                 ^

## 1 warning generated.

## Warning in readLines(private$hpp_file_): incomplete final line found on
## '/var/folders/0x/6bnjy4n15kz8nbfbbwbyk3pr0000gn/T//Rtmpgw6v1y/model-9abc52b29753.hpp'
compile1
##    user  system elapsed 
##  50.383   2.346  57.178
compile2 <- system.time(enw_model())
## Using model /Users/michael/wfu-id/epinowcast/inst/stan/epinowcast.stan.

## include is /Users/michael/wfu-id/epinowcast/inst/stan.

## Using `/Users/michael/Library/Caches/epinowcast` for the cache location.
## Model executable is up to date!
compile2
##    user  system elapsed 
##   0.052   0.017   0.085

Second R session (completely new session)
#' Load the package contents
devtools::load_all()
## ℹ Loading epinowcast
Sys.time()
## [1] "2023-12-14 15:27:21 EST"
#' Inspect the cache
enw_get_cache()
## Using `/Users/michael/Library/Caches/epinowcast` for the cache location.

## [1] "/Users/michael/Library/Caches/epinowcast"
enw_unset_cache()
enw_get_cache()
## [1] "/var/folders/0x/6bnjy4n15kz8nbfbbwbyk3pr0000gn/T//Rtmpgw6v1y"
#' Set the cache and confirm
enw_set_cache(rappdirs::user_cache_dir("epinowcast"))
enw_get_cache()
## Using `/Users/michael/Library/Caches/epinowcast` for the cache location.

## [1] "/Users/michael/Library/Caches/epinowcast"
compile1 <- system.time(enw_model())
## Using model /Users/michael/wfu-id/epinowcast/inst/stan/epinowcast.stan.

## include is /Users/michael/wfu-id/epinowcast/inst/stan.

## Using `/Users/michael/Library/Caches/epinowcast` for the cache location.
## Model executable is up to date!
compile1
##    user  system elapsed 
##   0.043   0.020   0.094
compile2 <- system.time(enw_model())
## Using model /Users/michael/wfu-id/epinowcast/inst/stan/epinowcast.stan.

## include is /Users/michael/wfu-id/epinowcast/inst/stan.

## Using `/Users/michael/Library/Caches/epinowcast` for the cache location.
## Model executable is up to date!
compile2
##    user  system elapsed 
##   0.045   0.011   0.069
