Comments (12)
I do not think there is a need. tar_map() accepts fully-formed target objects with patterns of their own, and static branching modifies those patterns as needed. Users should set pattern beforehand on a target-by-target basis.
library(targets)
tar_script({
  library(tarchetypes)
  templates <- list(
    tar_target(x, f(a)),
    tar_target(y, f(x), pattern = map(x)),
    tar_target(z, g(y), pattern = map(y))
  )
  targets <- tar_map(values = list(a = c(1, 2)), templates)
  tar_pipeline(targets)
})
tar_manifest()
#> # A tibble: 6 x 3
#>   name  command pattern
#>   <chr> <chr>   <chr>
#> 1 z_1   g(y_1)  map(y_1)
#> 2 z_2   g(y_2)  map(y_2)
#> 3 y_1   f(x_1)  map(x_1)
#> 4 y_2   f(x_2)  map(x_2)
#> 5 x_1   f(1)    <NA>
#> 6 x_2   f(2)    <NA>
tar_visnetwork(targets_only = TRUE)
Created on 2020-12-29 by the reprex package (v0.3.0)
from tarchetypes.
Thanks, that is very clean. But I have a case where the pattern of tar_map() has to be dynamic.
Specifically, I have data from many psychological tests, each of which has to be preprocessed by a different function. My pipeline therefore depends on both the data and the functions. targets does not treat functions as something to branch over dynamically, so I build static branches over all of these functions. However, the data have to be dynamic: they cannot be fetched easily, so a target in the pipeline is required to fetch them. The targets created by tar_map() would then preferably branch over this dynamic target.
So I am looking forward to seeing this pattern supported in tar_map(), although I can use tar_target_raw() to build the targets iteratively myself. Are there perhaps better methods?
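For reference, building the targets by hand with tar_target_raw() might look roughly like the sketch below. The function names (prep_test_a, prep_test_b) and the upstream target dataset are hypothetical placeholders, not names from this thread, and this is only one possible way to write the loop.

```r
library(targets)

# Hypothetical preprocessing function names, e.g. read from a config file.
prep_fun_names <- c("prep_test_a", "prep_test_b")

# Build one command per function name: prep_test_a(dataset), prep_test_b(dataset).
commands <- lapply(prep_fun_names, function(fn) {
  as.call(list(as.symbol(fn), quote(dataset)))
})

# One target per function, each branching dynamically over `dataset`.
# `dataset` is assumed to be an upstream target defined elsewhere in the pipeline.
prep_targets <- Map(
  function(fn, command) {
    tar_target_raw(
      name = paste0("preprocessed_", fn),
      command = command,
      pattern = quote(map(dataset))
    )
  },
  prep_fun_names,
  commands
)
```

The resulting list of target objects would then be placed in the pipeline alongside the target that produces dataset.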
from tarchetypes.
I suggest a different way to express the pipeline, either with static branching to split the data before preprocessing or dynamic branching to map over a list of functions. Sketch of the latter:
tar_pipeline(
  tar_target(
    dataset,
    download_dataset() %>%
      dplyr::group_by(...) %>%
      targets::tar_group(),
    iteration = "group"
  ),
  tar_target(
    functions,
    create_function_list_from_dataset(dataset),
    iteration = "list"
  ),
  tar_target(
    preprocessed,
    functions(dataset),
    pattern = map(functions, dataset)
  )
)
pattern in tar_map() would be incompatible with the design of targets and tarchetypes, and it would risk enabling suboptimal usage habits.
from tarchetypes.
Maybe the challenge here is tracking changes to the functions. I have tried this method before, and I could not find a way to make it work when the functions are in a dynamic branch.
from tarchetypes.
The function list should always rerun (quick) but only the affected branches should rerun after that (see evidence below). So I think the workaround could work.
library(targets)
tar_script({
  options(crayon.enabled = FALSE)
  f <- function() {
    c(f = "f_current")
  }
  g <- function() {
    c(g = "g_old")
  }
  tar_pipeline(
    tar_target(
      functions,
      list(f = f, g = g),
      iteration = "list"
    ),
    tar_target(
      output,
      functions(),
      pattern = map(functions)
    )
  )
})
tar_make()
#> * run target functions
#> * run branch output_e1755eb6
#> * run branch output_51320a50
tar_read(output)
#>           f           g
#> "f_current"     "g_old"
tar_script({
  options(crayon.enabled = FALSE)
  f <- function() {
    c(f = "f_current")
  }
  g <- function() {
    c(g = "g_new")
  }
  tar_pipeline(
    tar_target(
      functions,
      list(f = f, g = g),
      iteration = "list"
    ),
    tar_target(
      output,
      functions(),
      pattern = map(functions)
    )
  )
})
tar_make()
#> * run target functions
#> v skip branch output_e1755eb6
#> * run branch output_6dbb0c9e
tar_read(output)
#>           f           g
#> "f_current"     "g_new"
Created on 2020-12-30 by the reprex package (v0.3.0)
One note: functions have fickle internals that change the first few times they are called, which changes the hash. This was a problem in drake when users returned functions from targets: ropensci/drake#345. Because of the improvements in targets, I think this is unlikely to affect you (unless you call tar_make(callr_function = NULL) and manually run the functions repeatedly in the same R session) but I have not spent much time exploring it personally.
from tarchetypes.
Thank you very much! I understand what you mean; maybe I need to refactor my pipeline a little. I will post my solution later, which uses tar_target_raw() instead.
from tarchetypes.
@wlandau Sorry to disturb you about this issue again. I just have no idea how to create the functions list from a configuration file which stores the function names as characters.
I have tried getFromNamespace() or get(), which worked but it will always rerun if the function is from a package and the package updates. When the function is in the global environment, there is nothing unexpected.
I also tried rlang::sym(), but it did not work: a symbol is different from a call.
from tarchetypes.
I just have no idea how to create the functions list from a configuration file which stores the function names as characters.
You can use tidy evaluation to insert symbols into a target's command. That should allow you to start from a character vector.
library(targets)
tar_script({
  library(rlang)
  library(targets)
  fns <- c(a = "a", b = "b")
  list(
    tar_target(functions, list(!!!syms(fns)))
  )
})
tar_manifest()
#> # A tibble: 1 x 3
#>   name      command            pattern
#>   <chr>     <chr>              <chr>
#> 1 functions list(a = a, b = b) <NA>
Created on 2021-03-14 by the reprex package (v1.0.0)
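To illustrate what the splicing does, here is a minimal base R sketch of the same idea without rlang or targets. It only shows how character names become symbols and how symbols are assembled into a call; the variable names fn_syms, command, and fn_call are introduced here for illustration.

```r
# Function names as they might appear in a configuration file.
fns <- c(a = "a", b = "b")

# Convert each name to a symbol, then assemble the symbols into a call to list().
fn_syms <- lapply(fns, as.symbol)
command <- as.call(c(list(as.symbol("list")), fn_syms))

# A symbol alone only names a function; wrapping it in a call invokes it.
fn_call <- as.call(list(as.symbol("a")))

print(command)
print(fn_call)
```

The first object is the unevaluated expression that ends up as the target's command; the second shows the difference between the symbol a and the call a().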
I have tried getFromNamespace() or get(), which worked but it will always rerun if the function is from a package and the package updates.
To track functions from a package and the inner nested functions they call, list the package name in the imports option as described here.
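A minimal sketch of that setting in _targets.R, assuming the functions live in a package called mypackage (a hypothetical name used only for illustration):

```r
# _targets.R (sketch)
library(targets)

# "mypackage" is a hypothetical package name; replace it with the package
# that provides the preprocessing functions to be tracked.
tar_option_set(imports = "mypackage")
```

With this option set, changes to the package's functions are meant to invalidate the targets that depend on them, rather than every package update forcing a rerun.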
from tarchetypes.
Thank you very much for your response, and sorry for the late feedback. This method requires that fns is an object that is not dynamically generated, so it is not very compatible with the previous suggestion of create_function_list_from_dataset().
from tarchetypes.
If that's the case, then we are back to static branching over functions and dynamically branching over data, which is the better option anyway when it can be done.
library(targets)
tar_script({
  options(tidyverse.quiet = TRUE)
  library(rlang)
  library(targets)
  library(tarchetypes)
  library(tidyverse)
  values <- tribble(
    ~f,   ~g,   ~name,
    "f1", "g1", "function_set1",
    "f2", "g2", "function_set2",
  ) %>%
    mutate(f = syms(f), g = syms(g))
  tar_map(
    values = values,
    names = name,
    tar_target(x, f()),
    tar_target(y, g(x), pattern = map(x))
  )
})
tar_manifest()
#> # A tibble: 4 x 3
#>   name            command             pattern
#>   <chr>           <chr>               <chr>
#> 1 x_function_set1 f1()                <NA>
#> 2 x_function_set2 f2()                <NA>
#> 3 y_function_set1 g1(x_function_set1) map(x_function_set1)
#> 4 y_function_set2 g2(x_function_set2) map(x_function_set2)
tar_visnetwork(targets_only = TRUE)
Created on 2021-03-17 by the reprex package (v1.0.0)
from tarchetypes.
Thank you for your quick response. I was thinking about it the same way as in your last post, so there has to be some trade-off in this case. I will try this method.
from tarchetypes.
The final version I used is as follows (original link). It is sufficient for my use and very clear to me. Thank you very much for your help!
list(
  ...,
  targets_indices <- tar_map(
    values = tarflow.iquizoo::game_info %>%
      group_by(prep_fun_name) %>%
      summarise(game_ids = list(game_id), .groups = "drop") %>%
      mutate(prep_fun = syms(prep_fun_name)),
    names = prep_fun_name,
    tar_target(data_prep, data %>% filter(game_id %in% game_ids)),
    tar_target(indices, tarflow.iquizoo::calc_indices(data_prep, prep_fun))
  ),
  tar_combine(game_indices, targets_indices[[2]], format = "fst_tbl")
)
from tarchetypes.