Comments (18)
To render the individual documents, you could use static branching with `tar_eval()` to create a `tar_quarto()` target for each qmd file. Something like:
```r
# _targets.R
# ...
list(
  tar_target(...),
  tar_eval(
    tar_quarto(name, report),
    values = list(
      name = rlang::syms(c("report1", "report2")),
      report = c("report1.qmd", "report2.qmd")
    )
  )
)
```
For projects, `tar_quarto()` runs a single `quarto::quarto_render()` call for the entire project, and Quarto may do more than simply render `.qmd` and `.Rmd` files. Any extra steps done by Quarto, such as rendering bibliographies and Jupyter notebooks in addition to `.Rmd` files, are the responsibility of that single `quarto_render()` call, and `tarchetypes` would not be able to replicate them reliably.
In most cases, I don't think it is too much of a burden to render all the documents together. It is best if the pipeline does long-running tasks in upstream targets rather than in the code chunks of the qmd files, and this modularity should allow `tar_quarto()` targets to run quickly even if no reports are skipped.
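As a concrete sketch of that division of labor (all names here are invented for illustration and not from this thread): the expensive computation lives in upstream targets, and the Quarto target only reads the cached results.

```r
# _targets.R -- hypothetical sketch; fit_model(), raw_data, and report.qmd
# are placeholders, not part of the original discussion.
library(targets)
library(tarchetypes)
list(
  # The long-running work happens here and is cached by targets.
  tar_target(model, fit_model(raw_data)),
  # report.qmd only calls tar_read(model), so rendering stays fast.
  tar_quarto(report, "report.qmd")
)
```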
> Is the better approach to do what @tjmahr does with notestar or @jdtrat #23 (comment) and generate a list of .qmd files that gets fed to `tar_target(..., format = "file")`?
Because of the flexible project functionality in Quarto, I was hoping `tar_quarto()` would avoid the need for low-level workarounds that require users to manually call `tar_knitr_deps()`. Those workarounds are still possible, but I don't think they are as necessary unless you have a use case where rendering the reports in a project is unavoidably time-consuming.
from tarchetypes.
I think we agree the immediate advantage of `tar_knitr_deps_expr()` is to help advanced users write their own dependency-aware R Markdown targets, whether they get rendered through `bookdown::render_book()`, `rmarkdown::render_site()`, or something else. That should buy us some time.
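For instance, a dependency-aware bookdown target might look something like the following (a sketch under assumptions, not an official recipe: it presumes an `index.Rmd`-based bookdown project in the working directory).

```r
# Sketch only: splice the statically detected report dependencies into the
# command with tidy evaluation, then render the whole book.
library(targets)
library(tarchetypes)
tar_target(
  book,
  command = {
    !!tar_knitr_deps_expr(list.files(pattern = "[.]Rmd$")) # Spliced at definition time.
    bookdown::render_book("index.Rmd", quiet = TRUE)
    "_book" # Track the rendered output directory as a file target.
  },
  format = "file"
)
```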
However, looking long term, I do not think this is the final answer. Most users get confused when I talk about static code analysis and dependency relationships, and I do not expect them to understand the issue you raised. For the use cases that become common enough, potentially `bookdown` and `rmarkdown` sites, we can write some new user-friendly target factories (e.g. `tar_bookdown()`, like `tar_render()` for bookdown projects). This approach could lead to a lot of new functions and ultimately one or more new targetopia packages for literate programming (`tarmarkdown`?). If that happens, we could move the `tarchetypes` literate programming stuff there and re-export it back for compatibility.
My head is spinning with targetopia ideas; I should start writing them down somewhere public.
@jdblischak, do you think `workflowr` could benefit from its own `tar_render()`-like target factory?
Building some sort of pipeline directly into `wflow_build()` and `wflow_publish()` does sound handy, though this seems to have a slightly different set of goals and opinions than `targets` has. For `targets`, an R Markdown report is a document that depends on upstream targets and renders quickly. The intent is to summarize the hard work already done inside the targets themselves and to do very little work in the actual report, so `targets` does not really think about R Markdown reports that depend on each other (`1-data.Rmd`, `2-model.Rmd`, etc.). I was thinking that if any or all of these documents happen to use `targets`, there could be some way to capture that pattern and make it easier, but potentially not include steps that commit or publish downstream.
The code `!!tar_knitr_deps_expr(rmd_file)` is evaluated when the target is defined, not when the target runs. So, as you found out, the pipeline needs `!!tar_knitr_deps_expr("report.Rmd")` instead.
> there are probably cases where it would be useful to not have to do that.
This is a permanent limitation: `targets` needs to know the dependency structure in advance so it can run the correct targets in the correct order and invalidate the correct ones when dependencies change. I feel this is a reasonable constraint because most users write their R Markdown source by hand before running it.
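One way to see this definition-time resolution (a sketch, assuming a `report.Rmd` that loads `data1` and `data2` already exists in the working directory) is to inspect the pipeline with `tar_manifest()`: the spliced dependency symbols are baked into the command before anything runs.

```r
library(targets)
tar_script({
  library(tarchetypes)
  list(
    tar_target(data1, 1),
    tar_target(data2, 2),
    tar_target(
      report,
      command = {
        !!tar_knitr_deps_expr("report.Rmd") # Resolved now, not at run time.
        rmarkdown::render("report.Rmd", quiet = TRUE)
        "report.html"
      },
      format = "file"
    )
  )
})
tar_manifest() # The report command shows the literal spliced symbols.
```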
Possibly related to the discussion around `workflowr` is the new `tar_github_actions()` function in dev, which writes a workflow file to run a `targets` pipeline on GitHub Actions and upload historical runs to a special `targets-runs` branch. Demonstrated at https://github.com/wlandau/targets-minimal. Maybe this creates new opportunities?
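Usage is a single call (assuming a version of `targets` that exports the function; the workflow file path is illustrative):

```r
# Writes a GitHub Actions workflow file (e.g. .github/workflows/targets.yaml)
# that runs the pipeline on push and archives results on the targets-runs branch.
library(targets)
tar_github_actions()
```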
Hiya @wlandau! I'm getting into some serious writing for my master's thesis. The analyses were done with targets (again, thank you so much for such an incredible package!) and I wanted to incorporate the pipeline with bookdown. I adapted `tarchetypes::tar_render()` into a `tar_render_book()` function that follows a similar form. It seems to work pretty well for my purposes, so I thought I'd see if there was any interest in making it more robust/supported?
I have an idea of the limitations of my implementation (and think @tjmahr's notestar package does an excellent job at addressing many of them!). I made a demo project at jdtrat/tar-render-book-demo with the adaptations I made. I'd love to contribute to the targetopia (after my thesis is submitted😅) and would value any feedback!
Great work, @jdtrat! At a glance it looks promising. Really happy to see that someone took this on. Would be happy to loop back when you are ready to package up what you have. Good luck with your thesis!
Hello,
I am also using `render_site()`, and I am using Garrick's trick from rstudio/gt#297 (comment):

```r
# callr trick by gadenbuie: https://github.com/rstudio/gt/issues/297#issuecomment-497778735
callr::r(
  function(...) rmarkdown::render_site(...),
  args = list(
    input = Rmd,
    output_format = "xaringan::moon_reader",
    quiet = FALSE
  )
)
```

Hope this helps!
Good idea, @tjmahr. Now implemented.
````r
library(tarchetypes)
lines1 <- c(
  "---",
  "title: report",
  "output_format: html_document",
  "---",
  "",
  "```{r}",
  "tar_load(data1)",
  "tar_read(data2)",
  "```"
)
lines2 <- c(
  "---",
  "title: report",
  "output_format: html_document",
  "---",
  "",
  "```{r}",
  "tar_load(data2)",
  "tar_read(data3)",
  "```"
)
report1 <- tempfile()
report2 <- tempfile()
writeLines(lines1, report1)
writeLines(lines2, report2)
tar_knitr_deps(c(report1, report2))
#> [1] "data1" "data2" "data3"
tar_knitr_deps_expr(c(report1, report2))
#> list(data1, data2, data3)
````
Created on 2020-12-18 by the reprex package (v0.3.0)
To be clear, I don’t think there is a huge rush to implement tarmarkdown right away. There is plenty of space in tarchetypes in the short and medium term to act as a proving ground. And depending on the use cases we find, it may or may not make sense to bud off a new package once we have enough concrete material.
> do you think workflowr could benefit from its own tar_render()-like target factory?
@wlandau I am definitely open to the possibility of integrating targets into workflowr. Support for pipelines based on file dependencies has been one of the most requested features (see discussion in workflowr/workflowr#9), but I have been hesitant to implement it since this is a huge feature. Using a dedicated pipeline tool like targets would enable me to focus only on the workflowr-specific aspects.
A big issue is that workflowr recognizes two different kinds of "outdated" files. The first, `wflow_build()`, is based on file modifications (similar to Make and targets) and is used when developing an analysis. The second, `wflow_publish()`, is for when the analysis is ready to be committed and then added to the website. This is based on the `git status` of the files, not the file modification time. Thus any solution for workflowr would have to distinguish between these two ways of tracking the files.
I need to better investigate the current features provided by targets. I'd appreciate any advice on which features to focus on.
> this seems to have a slightly different set of goals and opinions than targets has.
Agreed. The framework for organizing the computation is different. There's no need to force the integration. If it's possible to combine workflowr/targets, that'd be nice. But at minimum, by investigating a potential integration, I'll learn from your implementation.
I am trying this out in practice, but I am not sure if I am missing something. The example doesn't show how one would use the new functions in a `tar_target()`. I am trying:
```r
list(
  ...,
  tar_target(notebook_deps, tar_knitr_deps_expr(notebook_rmd_files)),
  tar_target(
    notebook,
    command = {
      deps <- notebook_deps
      other_deps <- list(
        notebook_bookdown_yaml_file,
        notebook_output_yaml_file
      )
      rmarkdown::render_site("notebook", encoding = "UTF-8")
      "notebook/docs/notebook.html"
    },
    format = "file"
  )
)
```
but `tar_visnetwork()` is not connecting any targets to the files. `tar_knitr_deps_expr(notebook_rmd_files)` is returning a list of symbols of target names, so that part is working. Am I missing something about the evaluation?
Looks almost right, but I would check your commands with `tar_manifest()` to be sure. An easier way to do the metaprogramming is with tidy evaluation:
```r
tar_target(
  notebook,
  command = {
    !!tar_knitr_deps_expr(notebook_rmd_files) # Note the "!!".
    other_deps <- list(
      notebook_bookdown_yaml_file,
      notebook_output_yaml_file
    )
    rmarkdown::render_site("notebook", encoding = "UTF-8")
    "notebook/docs/notebook.html"
  },
  format = "file"
)
```
But taking a step back, your workflow is file-based, so you are left having to fight the variable-based behavior that I designed `targets` for. You will probably have an easier time taking manual control of the target's dependencies. You can do this with the `deps` argument of `tar_target_raw()`. (Although you will have to update `targets`, because I just had to fix a bug to get this to work.)
```r
tar_target_raw(
  "notebook",
  command = quote({
    rmarkdown::render_site("notebook", encoding = "UTF-8")
    "notebook/docs/notebook.html"
  }),
  format = "file",
  deps = tar_knitr_deps(notebook_rmd_files) # Returns a character vector of dependency names.
)
```
To give you a better idea:
```r
library(targets)
tar_script({
  x <- 1
  tar_target_raw("y", quote(1))
})
tar_visnetwork()
tar_script({
  x <- 1
  tar_target_raw("y", quote(1), deps = "x")
})
tar_visnetwork()
```
Created on 2021-01-07 by the reprex package (v0.3.0)
Hi,
I have been struggling to get this to work with `rmarkdown::render_site()` for a single file, but I keep getting an "object not found" error. I think there might need to be some extra quoting or something, but I can't quite work it out. Any help would be great!
Here is what I have been trying:
```r
list(
  tar_target(
    rmd_file,
    "report.Rmd",
    format = "file"
  ),
  tar_target(
    html_file,
    command = {
      !!tar_knitr_deps_expr(rmd_file)
      rmarkdown::render_site(rmd_file)
      "report.html"
    },
    format = "file"
  )
)
```
```r
targets::tar_visnetwork()
#> Error in file.exists(path) : object 'rmd_file' not found
```
If you manually give the path to the file as a string it works, but there are probably cases where it would be useful to not have to do that.
```r
...
command = {
  !!tar_knitr_deps_expr("report.Rmd")
  rmarkdown::render_site(rmd_file)
  "report.html"
}
...
```
Thanks so much for the new Quarto integration with #98: it's so easy and intuitive!
Quick question about `tar_quarto()`: is there a way to add dynamic branches to pick up on all the source files for a Quarto website and show them as unique targets in the network graph (or general list of targets)? Right now, when using a Quarto website, the entire website shows up as a single node in the graph, and it might be helpful to show the individual .qmd files.
Here's a reprex:
````r
# Create a simple Quarto website
targets::tar_dir({
  lines <- c(
    "---",
    "title: index",
    "output_format: html",
    "---",
    "A home page"
  )
  writeLines(lines, "index.qmd")
  lines <- c(
    "project:",
    "  type: website",
    "",
    "website:",
    "  title: 'website'",
    "  navbar:",
    "    left:",
    "      - index.qmd",
    "      - report1.qmd",
    "      - report2.qmd",
    "",
    "format:",
    "  html:",
    "    theme: cosmo"
  )
  writeLines(lines, "_quarto.yml")
  lines <- c(
    "---",
    "title: report1.qmd source file",
    "output_format: html",
    "---",
    "Assume these lines are in report1.qmd.",
    "```{r}",
    "targets::tar_read(data1)",
    "```"
  )
  writeLines(lines, "report1.qmd")
  lines <- c(
    "---",
    "title: report2.qmd source file",
    "output_format: html",
    "---",
    "Assume these lines are in report2.qmd.",
    "```{r}",
    "targets::tar_read(data2)",
    "```"
  )
  writeLines(lines, "report2.qmd")
  # Define a workflow that uses tar_quarto()
  targets::tar_script({
    library(tarchetypes)
    list(
      tar_target(data1, data.frame(x = seq_len(26), y = letters)),
      tar_target(data2, data.frame(x = seq_len(26), y = LETTERS)),
      tar_quarto(website, path = ".")
    )
  }, ask = FALSE)
  targets::tar_visnetwork()
})
````
With that folder structure and pipeline, this is the network it generates:
If I edit `report1.qmd` or `report2.qmd` individually (not in a temporary `tar_dir()`), targets shows that the whole website node is outdated and rebuilds the whole site. By adding a freeze setting to `_quarto.yml`, it's possible to re-render only the modified pages (see https://quarto.org/docs/projects/code-execution.html#freeze):
```yaml
execute:
  freeze: auto  # re-render only when source changes
```
…so extra computation time can be saved on Quarto's end, but on targets's end it doesn't register that only one file in the website project directory has changed.
`tarchetypes::tar_quarto_files()` lists the .qmd sources in the website:

```r
tarchetypes::tar_quarto_files()
#> $sources
#> [1] "index.qmd"   "report1.qmd" "report2.qmd"
#>
#> $output
#> [1] "_site"
#>
#> $input
#> [1] "_quarto.yml"
```
Is there a way to use that list of sources to create dynamic branches so that each .qmd file shows up in the network graph, or is it not possible because using `tar_quarto()` on a website project entails re-rendering the whole website, so it doesn't matter whether an individual .qmd file has been edited?
Is the better approach to do what @tjmahr does with notestar or @jdtrat does with bookdown here and generate a list of .qmd files that gets fed to `tar_target(..., format = "file")`?
Thanks!
Got it, that makes sense given Quarto's project-level rendering (and additional time can be saved with Quarto's freeze capabilities). Thanks!