
Comments (18)

wlandau commented on May 29, 2024 3

To render the individual documents, you could use static branching with tar_eval() to create a tar_quarto() target for each qmd file. Something like:

# _targets.R
# ...
list(
  tar_target(...),
  tar_eval(
    tar_quarto(name, report),
    values = list(
      name = rlang::syms(c("report1", "report2")), # target names as symbols
      report = c("report1.qmd", "report2.qmd")     # matching source files
    )
  )
)
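If there are many reports, the values list for tar_eval() could be built from a vector of file names rather than typed by hand. The following is only a sketch: the file names are hard-coded for illustration (in a real project they might come from tarchetypes::tar_quarto_files()$sources), and tar_sub() is used instead of tar_eval() so the generated calls can be inspected without rendering anything.

```r
# Source files, hard-coded here for illustration.
sources <- c("report1.qmd", "report2.qmd")

# Derive one target name per file by dropping the directory and extension.
target_names <- tools::file_path_sans_ext(basename(sources))

# tar_eval() and tar_sub() expect equal-length elements;
# target names must be symbols, file paths stay as strings.
values <- list(
  name = lapply(target_names, as.symbol),
  report = sources
)

# Optional preview: tar_sub() returns the substituted calls unevaluated.
if (requireNamespace("tarchetypes", quietly = TRUE)) {
  print(tarchetypes::tar_sub(tar_quarto(name, report), values = values))
}
```

Once the previewed calls look right, the same values list can be passed to tar_eval() inside _targets.R.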

For projects, tar_quarto() runs a single quarto::quarto_render() call for the entire project, and Quarto may do more than simply render qmd and Rmd files. Any extra steps done by Quarto, such as rendering of bibliographies and Jupyter notebooks in addition to Rmd files, are the responsibility of that single quarto_render() call, and tarchetypes would not be able to replicate them reliably.

In most cases, I don't think it is too much of a burden to render all the documents together. It is best if the pipeline does long-running tasks in upstream targets rather than the code chunks in the qmd files, and this modularity should allow tar_quarto() targets to run quickly even if no reports are skipped.

> Is the better approach to do what @tjmahr does with notestar or @jdtrat does with bookdown (#23 (comment)) and generate a list of .qmd files that gets fed to tar_target(..., format = "file")?

Because of the flexible project functionality in Quarto, I was hoping tar_quarto() would avoid the need for low-level workarounds that require users to manually call tar_knitr_deps(). Those workarounds are still possible, but I don't think they are as necessary unless you have a use case where rendering the reports in a project is unavoidably time-consuming.

from tarchetypes.

wlandau commented on May 29, 2024 2

I think we agree the immediate advantage of tar_knitr_deps_expr() is to help advanced users write their own dependency-aware R Markdown targets, whether they get rendered through bookdown::render_book(), rmarkdown::render_site(), or something else. That should buy us some time.

However, looking long term, I do not think this is the final answer. Most users get confused when I talk about static code analysis and dependency relationships, and I do not expect them to understand the issue you raised. For the use cases that become common enough, potentially bookdown and rmarkdown sites, we can write some new user-friendly target factories (e.g. tar_bookdown(), like tar_render() for bookdown projects). This approach could lead to a lot of new functions and ultimately one or more new targetopia packages for literate programming (tarmarkdown?). If that happens, we could move the tarchetypes literate programming stuff there and re-export it back for compatibility.

My head is spinning with targetopia ideas; I should start writing them down somewhere public.

@jdblischak, do you think workflowr could benefit from its own tar_render()-like target factory?

wlandau commented on May 29, 2024 2

Building some sort of pipeline directly into wflow_build() and wflow_publish() does sound handy, though this seems to have a slightly different set of goals and opinions than targets has. For targets, an R Markdown report is a document that depends on upstream targets and renders quickly. The intent is to summarize the hard work already done inside the targets themselves and do very little work in the actual report, so targets does not really think about R Markdown reports that depend on each other (1-data.Rmd, 2-model.Rmd, etc.). I was thinking that if any or all of these documents happen to use targets, there could be some way to capture that pattern and make it easier, but potentially not include steps that commit or publish downstream.

wlandau commented on May 29, 2024 1

The code !!tar_knitr_deps_expr(rmd_file) is evaluated when the target is defined, not when the target runs. So as you found out, the pipeline needs !!tar_knitr_deps_expr("report.Rmd") instead.

> there are probably cases where it would be useful to not have to do that.

This is a permanent limitation: targets needs to know the dependency structure in advance so it can run the correct targets in the correct order and invalidate the correct ones when dependencies change. I feel this is a reasonable constraint because most users write their R Markdown source by hand before running it.
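The definition-time behavior of `!!` can be reproduced outside a pipeline with rlang. This is a minimal sketch, not how targets is implemented internally: fake_deps_expr() is a hypothetical stand-in for tar_knitr_deps_expr(), and rlang::expr() stands in for the tidy evaluation that tar_target() applies to its command.

```r
library(rlang)

# Hypothetical stand-in for tar_knitr_deps_expr(): it needs a real
# character path and returns an expression naming the dependencies.
fake_deps_expr <- function(path) {
  stopifnot(is.character(path))
  quote(list(data1, data2))
}

# Works: the path is a literal string, known while the command is built.
ok <- expr({
  !!fake_deps_expr("report.Rmd")
  "report.html"
})
print(ok)

# Fails: rmd_file is only the name of a target, not an R object that
# exists when the command is built, so `!!` cannot evaluate its operand.
failed <- tryCatch(
  expr({
    !!fake_deps_expr(rmd_file)
  }),
  error = function(e) conditionMessage(e)
)
print(failed)
```

The second call fails while the expression is being constructed, before anything resembling a pipeline run, which mirrors the error from tar_visnetwork() reported in this thread.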

wlandau commented on May 29, 2024 1

Possibly related to the discussion around workflowr is the new tar_github_actions() function in dev, which writes a workflow file to run a targets pipeline on GitHub Actions and upload historical runs to a special targets-runs branch. Demonstrated at https://github.com/wlandau/targets-minimal. Maybe this creates new opportunities?

jdtrat commented on May 29, 2024 1

Hiya @wlandau! I'm getting into some serious writing for my master's thesis. The analyses were done with targets (again, thank you so much for such an incredible package!) and I wanted to incorporate the pipeline with bookdown. I adapted tarchetypes::tar_render() into tar_render_book() that follows a similar form. It seems to work pretty well for my purposes so I thought I'd see if there was any interest in making it more robust/supported?

I have an idea of the limitations of my implementation (and think @tjmahr's notestar package does an excellent job at addressing many of them!). I made a demo project at jdtrat/tar-render-book-demo with the adaptations I made. I'd love to contribute to the targetopia (after my thesis is submitted😅) and would value any feedback!

wlandau commented on May 29, 2024 1

Great work, @jdtrat! At a glance it looks promising. Really happy to see that someone took this on. Would be happy to loop back when you are ready to package up what you have. Good luck with your thesis!

ginolhac commented on May 29, 2024

Hello,

I am also using render_site(), together with Garrick's trick from rstudio/gt#297 (comment):

# callr approach by gadenbuie: https://github.com/rstudio/gt/issues/297#issuecomment-497778735
# Render in a fresh R session so state from the current session cannot leak in.
callr::r(
  function(...) rmarkdown::render_site(...),
  args = list(
    input = Rmd,
    output_format = "xaringan::moon_reader",
    quiet = FALSE
  )
)

hope this helps

wlandau commented on May 29, 2024

Good idea, @tjmahr. Now implemented.

library(tarchetypes)
lines1 <- c(
  "---",
  "title: report",
  "output_format: html_document",
  "---",
  "",
  "```{r}",
  "tar_load(data1)",
  "tar_read(data2)",
  "```"
)
lines2 <- c(
  "---",
  "title: report",
  "output_format: html_document",
  "---",
  "",
  "```{r}",
  "tar_load(data2)",
  "tar_read(data3)",
  "```"
)
report1 <- tempfile()
report2 <- tempfile()
writeLines(lines1, report1)
writeLines(lines2, report2)

tar_knitr_deps(c(report1, report2))
#> [1] "data1" "data2" "data3"

tar_knitr_deps_expr(c(report1, report2))
#> list(data1, data2, data3)

Created on 2020-12-18 by the reprex package (v0.3.0)
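To make the idea concrete, here is a rough base R approximation of what such a dependency scan does: look for tar_load() and tar_read() calls and collect the target names. It is a simplified illustration only. The helper name find_target_deps() is made up, and a regular expression is much cruder than the static code analysis tar_knitr_deps() performs on the actual code chunks.

```r
# Hypothetical helper: collect target names mentioned in
# tar_load(<name>) or tar_read(<name>) calls, one match per line.
find_target_deps <- function(lines) {
  pattern <- "tar_(load|read)\\(([A-Za-z0-9_.]+)"
  matches <- regmatches(lines, regexec(pattern, lines))
  # Each match is c(full match, "load" or "read", target name).
  deps <- unlist(lapply(matches, function(m) if (length(m) == 3) m[3]))
  sort(unique(as.character(deps)))
}

# Chunk contents of the two reports from the reprex above.
report1_code <- c("tar_load(data1)", "tar_read(data2)")
report2_code <- c("tar_load(data2)", "tar_read(data3)")

deps <- find_target_deps(c(report1_code, report2_code))
print(deps)
#> [1] "data1" "data2" "data3"
```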

wlandau commented on May 29, 2024

To be clear, I don’t think there is a huge rush to implement tarmarkdown right away. There is plenty of space in tarchetypes in the short and medium term to act as a proving ground. And depending on the use cases we find, it may or may not make sense to bud off a new package once we have enough concrete material.

jdblischak commented on May 29, 2024

> do you think workflowr could benefit from its own tar_render()-like target factory?

@wlandau I am definitely open to the possibility of integrating targets into workflowr. Support for pipelines based on file dependencies has been one of the most requested features (see discussion in workflowr/workflowr#9), but I have been hesitant to implement it since this is a huge feature. Using a dedicated pipeline tool like targets would enable me to focus only on the workflowr-specific aspects.

A big issue is that workflowr recognizes 2 different kinds of "outdated" files. The first, used by wflow_build(), is based on file modifications (similar to Make and targets), and it is used when developing an analysis. The second, used by wflow_publish(), is for when the analysis is ready to be committed and then added to the website. This is based on the git status of the files, not the file modification time. Thus any solution for workflowr would have to distinguish between these two types of tracking the files.

I need to better investigate the current features provided by targets. I'd appreciate any advice on which features to focus on.

jdblischak commented on May 29, 2024

> this seems to have a slightly different set of goals and opinions than targets has.

Agreed. The framework for organizing the computation is different. There's no need to force the integration. If it's possible to combine workflowr/targets, that'd be nice. But at minimum, by investigating a potential integration, I'll learn from your implementation.

tjmahr commented on May 29, 2024

I am trying this out in practice, but I am not sure if I am missing something. The example doesn't show how one would use the new functions in a tar_target(). I am trying

list(
  ...,
  tar_target(notebook_deps, tar_knitr_deps_expr(notebook_rmd_files)),
  tar_target(
    notebook,
    command = {
      deps = notebook_deps
      other_deps <- list(
        notebook_bookdown_yaml_file,
        notebook_output_yaml_file
      )
      rmarkdown::render_site("notebook", encoding = 'UTF-8')
      "notebook/docs/notebook.html"
    },
    format = "file"
  )
)

but tar_visnetwork() is not connecting any targets to the files.

tar_knitr_deps_expr(notebook_rmd_files) is returning a list of symbols of target names, so that part is working.

Am I missing something about the evaluation?

wlandau commented on May 29, 2024

Looks almost right, but I would check your commands with tar_manifest() to be sure. An easier way to do the metaprogramming is with tidy evaluation:

tar_target(
  notebook,
  command = {
    !!tar_knitr_deps_expr(notebook_rmd_files) # note the "!!"
    other_deps <- list(
      notebook_bookdown_yaml_file,
      notebook_output_yaml_file
    )
    rmarkdown::render_site("notebook", encoding = 'UTF-8')
    "notebook/docs/notebook.html"
  },
  format = "file"
)
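Why the `!!` version connects the graph while the earlier attempt did not can be seen by looking at which symbols each command mentions. The sketch below uses base R's all.vars() as a rough stand-in for the static code analysis targets runs on a command; it is not the actual mechanism. The spliced dependency list, list(data1, data2), is made up for illustration, and render_site() is written without its namespace prefix only to keep the symbol listing clean.

```r
# First attempt: the command refers only to the helper target notebook_deps,
# so the upstream targets never appear in the code itself.
cmd_indirect <- quote({
  deps <- notebook_deps
  render_site("notebook")
})

# After `!!` splices in the result of tar_knitr_deps_expr(), the command
# would contain the dependency symbols directly (names are illustrative).
cmd_spliced <- quote({
  list(data1, data2)
  render_site("notebook")
})

vars_indirect <- all.vars(cmd_indirect)
vars_spliced <- all.vars(cmd_spliced)

print(vars_indirect) # mentions notebook_deps, but neither data1 nor data2
print(vars_spliced)  # mentions data1 and data2
```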

But taking a step back, your workflow is file-based, so you are left having to fight the variable-based behavior that I designed targets for. So you will probably have an easier time taking manual control of the target's dependencies. You can do this with the deps argument of tar_target_raw(). (Although you will have to update targets because I just had to fix a bug to get this to work.)

tar_target_raw(
  "notebook",
  command = quote({
    rmarkdown::render_site("notebook", encoding = 'UTF-8')
    "notebook/docs/notebook.html"
  }),
  format = "file",
  deps = tar_knitr_deps(notebook_rmd_files) # Returns a character vector of dependency names.
)

wlandau commented on May 29, 2024

To give you a better idea:

library(targets)
tar_script({
  x <- 1
  tar_target_raw("y", quote(1))
})

tar_visnetwork()

tar_script({
  x <- 1
  tar_target_raw("y", quote(1), deps = "x")
})

tar_visnetwork()

Created on 2021-01-07 by the reprex package (v0.3.0)

lazappi commented on May 29, 2024

Hi

I have been struggling to get this to work with rmarkdown::render_site() for a single file, but I keep getting an "object not found" error. I think there might need to be some extra quoting or something, but I can't quite work it out. Any help would be great!

Here is what I have been trying:

list(
    tar_target(
        rmd_file,
        "report.Rmd",
        format = "file"
    ),
    tar_target(
        html_file,
        command = {
            !! tar_knitr_deps_expr(rmd_file)
            rmarkdown::render_site(rmd_file)
            "report.html"
        },
        format = "file"
    )
)
> targets::tar_visnetwork()
Error in file.exists(path) : object 'rmd_file' not found

If you manually give the path to the file as a string it works but there are probably cases where it would be useful to not have to do that.

...
command = {
    !! tar_knitr_deps_expr("report.Rmd")
    rmarkdown::render_site(rmd_file)
    "report.html"
}
...

andrewheiss commented on May 29, 2024

Thanks so much for the new Quarto integration with #98; it's so easy and intuitive!

Quick question about tar_quarto()—is there a way to add dynamic branches to pick up on all the source files for a Quarto website and show them as unique targets in the network graph (or general list of targets)? Right now, when using a Quarto website, the entire website shows up as a single node in the graph, and it might be helpful to show the individual .qmd files.

Here's a reprex:

# Create a simple Quarto website
targets::tar_dir({
  lines <- c(
    "---",
    "title: index",
    "output_format: html",
    "---",
    "A home page"
  )
  writeLines(lines, "index.qmd")
  
  lines <- c(
    "project:",
    "  type: website",
    "",
    "website:",
    "  title: 'website'",
    "  navbar:",
    "    left:",
    "      - index.qmd",
    "      - report1.qmd",
    "      - report2.qmd",
    "",
    "format:",
    "  html:",
    "    theme: cosmo"
  )
  writeLines(lines, "_quarto.yml")
  
  lines <- c(
    "---",
    "title: report1.qmd source file",
    "output_format: html",
    "---",
    "Assume these lines are in report1.qmd.",
    "```{r}",
    "targets::tar_read(data1)",
    "```"
  )
  writeLines(lines, "report1.qmd")
  
  lines <- c(
    "---",
    "title: report2.qmd source file",
    "output_format: html",
    "---",
    "Assume these lines are in report2.qmd.",
    "```{r}",
    "targets::tar_read(data2)",
    "```"
  )
  writeLines(lines, "report2.qmd")
  
# Define a workflow that uses tar_quarto()
  targets::tar_script({
    library(tarchetypes)
    list(
      tar_target(data1, data.frame(x = seq_len(26), y = letters)),
      tar_target(data2, data.frame(x = seq_len(26), y = LETTERS)),
      tar_quarto(website, path = ".")
    )
  }, ask = FALSE)
  
  targets::tar_visnetwork()
})

With that folder structure and pipeline, this is the network it generates:

[image: tar_visnetwork() graph in which the whole website appears as a single node]

If I edit report1.qmd or report2.qmd individually (not in a temporary tar_dir), targets shows that the whole website node is outdated and rebuilds the whole site. By adding a freeze setting to _quarto.yml, it's possible to only re-render the modified pages (see https://quarto.org/docs/projects/code-execution.html#freeze):

execute:
  freeze: auto  # re-render only when source changes

…so extra computation time can be prevented on Quarto's end, but on targets's end, it doesn't register that only one file in the website project directory has changed.

tarchetypes::tar_quarto_files() lists the .qmd sources in the website:

#> tarchetypes::tar_quarto_files()
$sources
[1] "index.qmd"   "report1.qmd" "report2.qmd"

$output
[1] "_site"

$input
[1] "_quarto.yml"

Is there a way to use that list of sources to create dynamic branches so that each .qmd file shows up in the network graph, or is it not possible because using tar_quarto() on a website project entails re-rendering the whole website, so it doesn't matter if an individual .qmd file has been edited?

Is the better approach to do what @tjmahr does with notestar or @jdtrat does with bookdown here and generate a list of .qmd files that gets fed to tar_target(..., format = "file")?

Thanks!

andrewheiss commented on May 29, 2024

Got it, that makes sense given Quarto's project-level rendering (and additional time can be saved with Quarto's freeze capabilities). Thanks!
