jeromyanglim / rmarkdown-rmeetup-2012

Reproducible analysis with knitr, R Markdown, and RStudio: Slides and example R Markdown files from the presentation

Home Page: http://jeromyanglim.blogspot.com

rmarkdown-rmeetup-2012's Introduction

Overview of repository

The video for this talk can be viewed here on YouTube.

This repository includes the files related to the talk:

  • PDF of Slides
  • See the examples directory for four R Markdown examples.
  • While developing the presentation, I tried to follow principles of open science by doing much of my thinking in public. This thinking is recorded in the GitHub issue tracker; see the list of issues.
  • The talk itself was a beamer presentation. I used raw LaTeX for the header (talk/main.tex) and Markdown for the body (talk.md), which was converted by pandoc into LaTeX beamer (see talk/makefile).

Abstract of talk

Simple Reproducible Analysis with knitr, R Markdown, and RStudio

Reproducible analysis represents a process for transforming text, code, and data to produce reproducible artefacts including reports, journal articles, slideshows, theses, and books. Reproducible analysis is important in both industry and academic settings for ensuring a high quality product. R has always provided a powerful platform for reproducible analysis. However, in the first half of 2012, several new tools have emerged that have substantially increased the ease with which reproducible analysis can be performed. In particular, knitr, R Markdown, and RStudio combine to create a user-friendly and powerful set of open source tools for reproducible analysis.

This talk will provide an introduction to Markdown and using knitr and R Markdown to produce reproducible reports. In particular, it will discuss caching slow analyses, producing attractive plots and tables, and using RStudio as an IDE. The talk will also show how the markdown package on CRAN can be used to work with other R development environments and workflows for report production. If time permits, the talk may touch on how knitr can be used with other markup languages including LaTeX and HTML.

Jeromy Anglim is a Post Doctoral Fellow in the Melbourne Business School. He has spent the last 10 years teaching statistics in university settings. He has also worked as a statistical consultant in market research, selection and recruitment, and organisational climate survey research. He has been using R for the last five years and regularly blogs about the power of R at http://jeromyanglim.blogspot.com

Suggestions

I sent out an online query about things that people might like me to cover:

  • interested in a practical talk on using Sweave with R, both to be able to produce quality tables, and reports that can be updated with new data (ie, like monthly reports)
  • Also, if you could touch on latex, as R users who don't have an academic background won't be familiar with this.
  • From a practical perspective, knowing any issues to install either sweave or latex would be useful.
  • "I would like to know how to integrate LaTex and R or exporting R results for writing reports or journal papers."
  • "I'd be interested in hearing of any experiences you've had with creating reproducible presentations, as well as reports. I recently used the slidify package (http://ramnathv.github.com/slidify/) and would like to learn more."
  • "RStudio as an IDE sounds interesting"
  • "Text manipulation in R. Sed? Awk? Perl? Grep?"
  • "I would be interested in automate LaTex to web enabled form of reporting"
  • "How large a dataset can knitr deal with?"
  • "I am interested in how to use R to automate marketing operations reports including campaign reporting. Particularly if there is an ability to integrate through to MS Office products for commentary and end reporting"
  • "Best way of producing reproducible and visually appealing slideshows."
  • "integration of Knitr with RStudio, so I'm interested in learning more about that"
  • integrate LaTex and R or exporting R results for writing reports or journal papers.
  • I'm interested in how he uses RStudio & the applications of his reproducible research (how does it get published? does he publish the whole report with source? Share with his peers? Just remember it for later?)
  • I would be interested in experience-based best practices for keeping analyses organised as well as reproducible. Any tips about streamlining the re-use of analyses with different data sets would be good too.
  • Latex
  • I'm interested in reproducible analysis because it seems to offer the ability to complete more experiments in less time. I'm most interested in the planning that goes into it and how you would still have the flexibility to follow results.
  • Macros
  • Any experience integrating with the new generation of javascript visualisation tools like d3?
  • workflow and good practice (e.g. spreading source code over multiple files vs single file) in using RStudio and markdown as an IDE, also in integrating them with unit testing (e.g. Hadley Wickham's testthat package).

rmarkdown-rmeetup-2012's Issues

markdown to beamer-pdf or markdown to latex to post processing to beamer-pdf

I really like the minimum fuss of markdown to beamer-pdf. However, there is the tension that sooner or later you don't like the default choices made or you want to take advantage of powerful features.

Thus,

  • What is the best way to override default features?
  • What is the best way to use non-standard options?

To a certain extent, LaTeX will simply be passed through, but sometimes you want to override the default behaviour of the Markdown to LaTeX conversion.
Also, sometimes I want to add header information.

How to show images in Markdown generated from R Markdown on github?

I have a few github repositories that have multiple R Markdown files with each R Markdown file in a separate folder.

I want to be able to upload these repositories and I want the images to display when someone clicks on a Markdown file.

At the moment, the image links resolve to the blob version rather than the raw version, so the images do not display.

I asked about a general solution to the problem on Stack Overflow.

A general solution is to change the base.url setting

opts_knit$set(base.url='https://github.com/.../raw/.../')

For one particular file I wrote the following:

```{r echo = FALSE}
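library(knitr)  # provides opts_knit

# Raw-content URL of the repository; images are served from here on GitHub
github_baseurl <- 'https://github.com/jeromyanglim/gelman-bayesian-data-analysis/raw/master/'

# Name of the folder containing this R Markdown file;
# assumes that markdown file is stored in a folder below master
filepath <- strsplit(getwd(), '/')[[1]]
markdown_folder <- filepath[length(filepath)]
image_base_url <- paste0(github_baseurl, markdown_folder, '/')

# Prefix all generated figure paths with the GitHub raw URL
opts_knit$set(base.url=image_base_url)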
```

However, this setting should only be applied for the final build. Otherwise, preliminary compilations will not display properly on the local computer, because the images are not yet available on GitHub.

Thus, I'm looking for a general solution to this problem.
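
One possible approach, sketched here under assumptions rather than offered as a definitive answer, is to switch `base.url` on only when a flag is set, so that local previews keep using relative image paths. `PUBLISH_TO_GITHUB` and `image_base_url()` are hypothetical names introduced for illustration; only `opts_knit$set(base.url = ...)` comes from the text above.

```r
# Hypothetical helper: build the raw GitHub URL for the folder that holds
# the R Markdown file. `repo_raw_url` is the repository's raw-content URL.
image_base_url <- function(repo_raw_url, dir = getwd()) {
  # Ensure the repository URL ends with exactly one slash
  repo_raw_url <- sub("/*$", "/", repo_raw_url)
  paste0(repo_raw_url, basename(dir), "/")
}

# PUBLISH_TO_GITHUB is an assumed environment variable, not a knitr feature.
publishing <- identical(Sys.getenv("PUBLISH_TO_GITHUB"), "true")

# Only rewrite image paths when publishing; local builds stay untouched.
if (publishing && requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_knit$set(base.url = image_base_url(
    "https://github.com/jeromyanglim/gelman-bayesian-data-analysis/raw/master/"
  ))
}
```

With a chunk like this near the top of an R Markdown file, ordinary compilations leave `base.url` alone, and running `Sys.setenv(PUBLISH_TO_GITHUB = "true")` before the final knit switches the image links to GitHub.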

This also links into my general need to have a single makefile that will convert rmd files to md files in all folders of a repo with github friendly images.
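
As a rough sketch of that batch conversion, written in plain R rather than as the makefile mentioned above, the following finds every R Markdown file in a repository and knits each one from inside its own folder. The function names are hypothetical, and the sketch assumes that knitr is installed and that figure paths are relative to each file's folder.

```r
# Find every R Markdown file below `root` (by default the current directory).
find_rmd <- function(root = ".") {
  list.files(root, pattern = "\\.[Rr]md$", recursive = TRUE, full.names = TRUE)
}

# Knit one file from inside its own folder so relative figure paths resolve.
knit_in_place <- function(rmd) {
  old <- setwd(dirname(rmd))
  on.exit(setwd(old))  # always return to the starting directory
  md <- sub("\\.[Rr]md$", ".md", basename(rmd))
  knitr::knit(basename(rmd), output = md)
}

# Convert all R Markdown files under `root` to Markdown.
knit_all <- function(root = ".") {
  invisible(lapply(find_rmd(root), knit_in_place))
}
```

Calling `knit_all()` from the repository root would then regenerate every `.md` file in one step; a makefile target could simply invoke that call through `Rscript`.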

An argument for not using reproducible data analysis tools like knitr, Sweave, etc.?

Clearly, most researchers don't analyse their data with reproducible data analysis tools like knitr and Sweave.

  • Are there good reasons for not using fully reproducible data analysis tools?
  • What are the counter-arguments?
  • What can an analysis of not using such tools tell us about the obstacles?

For practical purposes I operationalise reproducible analysis as:

  • a one-click build
  • code performs all data transformations and analyses
  • all data and necessary metadata are provided
  • statistical output is automatically incorporated into the final report

One way of satisfying the above criteria is to use knitr or Sweave with R and LaTeX, plus a build script such as a makefile, all shared as a self-contained archive file.

Setting up a knitr analysis on a fresh Windows install

I run Linux, but I sometimes need to give an R script to someone so that they can run an automated production process involving knitr on Windows. The user of the script does not necessarily know much about R, and their machine does not come pre-configured for R or knitr.

What steps need to be taken?
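
Part of the answer can be scripted. The following is a minimal sketch, assuming R itself is already installed and that the process needs the knitr and markdown packages plus the external programs pandoc and pdflatex. The package list, the tool list, and all function names are assumptions to adapt, not a tested recipe for Windows.

```r
# Assumed requirements; adjust to what the production process really uses.
required_packages <- c("knitr", "markdown")
required_tools <- c("pandoc", "pdflatex")

# Return the packages from `pkgs` that cannot be loaded on this machine.
missing_packages <- function(pkgs = required_packages) {
  pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
}

# Return the programs from `tools` that are not found on the PATH.
missing_tools <- function(tools = required_tools) {
  tools[!nzchar(Sys.which(tools))]
}

# Install missing packages and report programs that need manual installation.
setup_machine <- function(repos = "https://cloud.r-project.org") {
  to_install <- missing_packages()
  if (length(to_install) > 0) install.packages(to_install, repos = repos)
  absent <- missing_tools()
  if (length(absent) > 0) {
    message("Install these programs manually and add them to the PATH: ",
            paste(absent, collapse = ", "))
  }
  invisible(list(packages = to_install, tools = absent))
}
```

The user would run `setup_machine()` once before the production script; anything it cannot install itself is reported by name rather than failing silently later.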

What are the degrees of reproducible data analysis?

I want to conceptualise reproducible data analysis in a broader context.

  • What are the different ways that reproducible data analysis can be achieved?
  • How do such degrees relate to achieving the aims of reproducible analysis?
  • What are the different aspects of reproducible data analysis?

Diagnosing pandoc error messages and combining latex with pandoc

I issue this command:

$ pandoc -o talk.tex talk.md  

and I get this error message

pandoc: 
Error:
"source" (line 112, column 1):
unexpected end of input
expecting "\\" or "$"

It seems to emerge once I add

\begin{document}

to my pandoc file. How can this be fixed?
